Examples of 3D graphics images that can be rendered with HP workstations
using the VISUALIZE fx graphics hardware. Renderings are courtesy of
Dassault Systemes of Suresnes, France, Division of Bristol, England, and
Parametric Technology Corporation of Waltham, Massachusetts.

Technical computing today is increasingly dominated by design and analysis
tasks that require high-performance workstation and software products. Some
of the products described in this issue address the needs of this emerging
market.

On the software side, we have the DirectModel 3D modeling toolkit and the
HP implementation of the OpenGL graphics standard. The toolkit provides
application developers with the capability to develop applications that can
construct 3D models containing millions or billions of polygons. DirectModel
is built on top of the HP OpenGL product. OpenGL is a vendor-neutral,
multiplatform, industry-standard application programming interface (API) for
developing 2D and 3D visual applications.

For running these applications, we have the HP Kayak PC-based workstation
running the Windows NT operating system. HP Kayak provides world-leading
3D graphics performance typically found in high-end UNIX workstations.
Much of the hardware architecture for HP Kayak is based on the VISUALIZE
fx graphics accelerator, which is designed to provide native acceleration for
the OpenGL API.

A common theme underlying the development of all these products is the
desire to shorten the time to market. Concurrent engineering was employed
in the OpenGL project to achieve this goal. Processes done in serial were
modified to be done in parallel, shortening the product development
cycle. Quality engineers at the HP Kobe Instrument Division reengineered
their quality assurance process to deal with the time-to-market
issue and still maintain high-quality released software.

We have two articles about HP-UX workstations. One describes a feature
that allows multiple monitors to be configured as one contiguous
viewing space, and the other discusses the challenges of adding the
Peripheral Component Interconnect, or PCI, to HP B-class and C-class
workstations.

Information is the fuel that drives today's enterprises. Thus, we have
three articles that discuss the use of information to do such tasks as
linking business manufacturing software to the factory floor, providing
a knowledge database for support personnel, and forecasting component
demand in material planning.

The article about HP VEE (Visual Engineering Environment) is an exam- 
ple of our new publishing paradigm of using the web to extend or com- 
plement what appears in the printed version of the Hewlett-Packard 
Journal. 

C. L. Leath
Managing Editor

In August we will have articles about a 150-MHz-bandwidth membrane
hydrophone, units of measurement for optical instruments, and efforts to
improve the reliability of ceramic pin grid array packaging and
surface-mount LEDs. We will also have articles from the HP Design
Technology Conference, the HP Compression Conference, and the HP
Electronic and Assembly Conference.

Articles



An API for Interfacing Interactive 3D Applications to High-Speed
Graphics Hardware

Kevin T. Lefebvre and John M. Brown
An introduction to the articles in this issue that describe the HP hardware
and software products that implement or support the OpenGL specification.

The Fast-Break Program

An Overview of the HP OpenGL Software Architecture

Kevin T. Lefebvre, Robert J. Casey, Michael J. Phelps, Courtney D.
Goeltzenleuchter, and Donley B. Hoffman
The features in the software component of the HP OpenGL product that
differentiate it from other OpenGL implementations include performance,
quality, and reliability.




The DirectModel Toolkit: Meeting the 3D Graphics Needs of Technical
Applications

Brian E. Cripe and Thomas A. Gaskins
Today's highly complex mechanical design automation systems require a
modelling toolkit for developing interactive applications capable of
handling 3D models containing millions or billions of polygons.




An Overview of the VISUALIZE fx Graphics Accelerator Hardware

Noel D. Scott, Daniel M. Olsen, and Ethan W. Gannett
Five custom integrated circuits make up the high-speed VISUALIZE fx family
of graphics subsystems.

Occlusion Culling
Fast Virtual Texturing




HP Kayak: A PC Workstation with Advanced Graphics Performance

Ross A. Cunniff
Graphics performance typically found in high-speed UNIX workstations has
been incorporated into a PC workstation running the Windows NT environment.




Concurrent Engineering in OpenGL Product Development

Robert J. Casey and L. Leonard Lindstone
The authors describe how the concepts of concurrent engineering helped the
HP OpenGL project to achieve a shorter time to market and a reduction in
rework.



Advanced Display Technologies on HP-UX Workstations

Todd M. Spencer, Paul M. Anderson, and David Sweetser
Recent versions of the HP-UX operating system contain features that allow
users to create more viewing space by configuring multiple monitors into a
single logical screen.

Delivering PCI in HP B-Class and C-Class Workstations: A Case Study in the
Challenges of Interfacing with Industry Standards

Ric L. Lewis, Erin A. Handgen, Nicholas J. Ingegneri, and Glen T. Robinson
The authors discuss some of the challenges involved in incorporating an
industry-standard I/O subsystem into HP workstations.



Linking Enterprise Business Systems to the Factory Floor

Kenn S. Jennyc
HP Enterprise Link is a middleware software product that allows business
management applications to exchange information with applications running
on the factory floor.



Knowledge Harvesting, Articulation, and Delivery

Kemal A. Delic and Dominique Lahaix
A knowledge-based software tool is used to help HP support personnel
provide customer support.

Glossary



A Theoretical Derivation of Relationships between Forecast Errors

Jerry I. Shan
A study of the errors associated with predicting component replacement
requirements in the materials planning process.



Strengthening Software Quality Assurance

Mutsuhiko Asada and Pong Mang Yan
Reengineering a software quality assurance program to deal with shorter
time-to-market goals.



A Compiler for HP VEE

Steven Greenbaum and Stanley Jefferson
The authors describe a compiler technology that is designed to improve the
execution speed of HP VEE (Visual Engineering Environment) programs.

The Hewlett-Packard Journal Online

http://www.hp.com/hpj/journal.html

What's new?

■ The Previews section contains the following new articles:

Techniques for Higher-Performance Boolean Equivalence Verification

Theory and Design of CMOS HSTL I/O Pads

On-chip Cross Talk Noise Model for Deep-Submicrometer ULSI Interconnect

Testing with the HP 9490 Mixed-Signal LSI Tester

A Low-Cost RF Multichip Module Packaging Family

Comparison of Finite-Difference and SPICE Tools for Thermal Modeling of
the Effects of High-Power CPUs

E-Mail Registration

■ Use E-Mail Notification to register your e-mail address so that you can
be notified when new articles are published.
An API for Interfacing Interactive 3D
Applications to High-Speed Graphics
Hardware



Kevin T. Lefebvre 



John M. Brown 



The OpenGL specification defines a software interface that can be
implemented on a wide range of graphics devices ranging from simple
frame buffers to fully hardware-accelerated geometry processors.



Kevin T. Lefebvre

A senior engineer in the graphics products laboratory at the HP Workstation
Systems Division, Kevin Lefebvre is responsible for the OpenGL architecture
and its implementation and delivery. He came to HP in 1989 from the Apollo
Systems Division. He has a BS degree in mathematics (1976) from
Carnegie-Mellon University. He was born in Pittsfield, Massachusetts, is
married and has two children. His hobbies include running, biking, and
skiing.



John M. Brown 

John Brown is a senior engineer in the graphics products laboratory of the
HP Workstation Systems Division. He is responsible for graphics application
performance. John came to HP in 1988. He holds a BSEE degree (1980) from
the University of Kentucky.




OpenGL is a specification for a software-to-hardware application
programming interface, or API, that defines operations needed to produce
interactive 3D applications. It is designed to be used on a wide range of
graphics devices, including simple frame buffers and hardware-accelerated
geometry processor systems. With design goals of efficiency and multiple
platform support, certain functions, such as windowing and input support,
have not been defined in OpenGL. These unsupported functions are included
in support libraries outside the core OpenGL definition.

OpenGL is targeted for use on a range of new graphics devices for both
UNIX-based and Windows NT-based operating system platforms. These systems
differ in both capabilities and performance.

Early in the OpenGL program at HP, industry partnerships were established
between the OpenGL R&D labs and key independent software vendors (ISVs)
to ensure a high-quality, high-performance product that met the needs of
these ISVs. These partnerships were also used to assist the ISVs in moving to
the HP OpenGL product (see "The Fast-Break Program" on page 8).

The various OpenGL articles in this issue describe the design philosophy and
the implementation of the HP version of OpenGL and other graphics products
associated with OpenGL.

History of OpenGL

OpenGL is a successor to Iris GL, a graphics library developed
by Silicon Graphics International (SGI). Major
changes have been made to the Iris GL specification in
defining OpenGL. These changes have been aimed at
making OpenGL a cleaner, more extensible architecture.

With the goal of creating a single open graphics standard,
the OpenGL Architecture Review Board (ARB) was formed
to define the specification and promote OpenGL in terms
of ISV use and availability of vendor implementations.
The original ARB members were SGI, Intel, Microsoft,
Digital Equipment Corporation, and IBM. Evans & Sutherland,
Intergraph, Sun, and HP were added more recently.
For more information on current ARB members, OpenGL
licensees, frequently asked questions, and other
ARB-related information, visit the OpenGL web site at
http://www.opengl.org.

The initial effort of the ARB was the 1.0 specification of
OpenGL, which became available in 1992. Along with
this specification was a series of conformance tests that
licensees needed to pass before an implementation could
be called OpenGL. Since then the ARB has added new
features and released a 1.1 specification in 1995 (the HP
implementation is based on 1.1). Work is currently being
done to define a 1.2 revision of the specification.

HP Involvement in OpenGL

HP became an OpenGL licensee in 1995. We had the goal
of delivering a native implementation of OpenGL that
would run on hardware and software that would provide
OpenGL performance leadership.

Shortly after licensing OpenGL, we established a relationship
with a third party to provide an OpenGL implementation
on our existing set of graphics hardware while we
worked on a new generation of hardware that was better
suited for OpenGL semantics. The OpenGL provided by
the third party used the underlying graphics hardware
acceleration where possible. However, it could not be
considered an accelerated implementation of OpenGL
because of features lacking in the hardware.

In August of 1996 we demonstrated our first native implementation
of OpenGL at Siggraph 96. This implementation
was fully functional and represented the software that
would be shipped with the future OpenGL-based hardware.
The implementation supported various device drivers
including a software-based renderer. The OpenGL development
effort culminated in the announcement and
delivery of OpenGL-based systems in the fall of 1997.

Software Implementation

In our implementation, we focused on the hardware's ability
to accelerate major portions of the rendering pipeline.
For the software, we focused on its ability to ensure that
the hardware could run at full performance. A fast graphics
accelerator is not needed if the driving software cannot
keep the hardware busy. The resulting software architecture
and implementation were designed from a system
viewpoint. Decisions were based on system requirements
to avoid overoptimizing each individual component and
still not achieving the desired results. An overview of the
HP OpenGL software architecture is provided in the article
on page 9. Another software-related issue is covered
in a separate article in this issue, which discusses issues
associated with porting a UNIX-based OpenGL implementation
to Windows NT.

Hardware Systems 

The new graphics systems are able to support OpenGL,
Starbase, PHIGS, and PEX rendering semantics in hardware.
Being able to support the OpenGL API means that
there is hardware support for accelerating the full feature
set of OpenGL instead of just having a simple frame buffer
in which all or most of the OpenGL features are implemented
in software. These systems are the VISUALIZE fx2,
VISUALIZE fx4, and VISUALIZE fx6 graphics accelerator
products. These systems differ in the amount of graphics
acceleration they provide, the number of image planes,
and the optional OpenGL features they provide. In addition
to the base graphics boards, a texture mapping option
is available for the fx4 and fx6 accelerators. The
article on page 28 provides an overview of the new
graphics hardware developed to support OpenGL.

Engineering Process

To meet the required delivery dates of OpenGL with a
high level of confidence and quality, we used a new process
to compress the time between first silicon and manufacturing
release. The article on page 41 describes the



May 1998 • The Hewlett-Packard Journal

©Copr. 1949-1998 Hewlett-Packard Co.



The Fast-Break Program

In basketball, a rapid offensive transition is called a fast-break.
The fast-break program is about the transition game
for OpenGL on HP systems. A key part of the HP transition to
OpenGL is applications, because applications enable volume
shipments of systems. Having the right applications is necessary
for a successful OpenGL product, but it is also important
that the applications run with outstanding performance and
reliability. Fast-break is about both aspects: getting the applications
on HP systems and ensuring that they have outstanding
performance and reliability.

Fast-break began by working with application developers in 
the early stages of the OpenGL program to understand their 
requirements for the HP OpenGL product. These requirements 
helped to drive the initial OpenGL product definition. 

As the program progressed, the fast-break team developed a
suite of tools that enabled detailed analysis of OpenGL applications.
Analysis of key applications was used to further refine
our OpenGL product performance and functionality. Analysis
also yielded a set of synthetic API benchmarks that represented
the behavior of key applications. These synthetic
benchmarks enabled HP to perform early hands-on evaluation
of the OpenGL product long before the actual applications
were ported to HP.

Pre-porting laid the groundwork for the actual porting of applications
to HP's implementation of OpenGL. The first phase of
the porting took place during the OpenGL beta program. In this
program, the HP fast-break team worked closely with selected
application developers to initiate the porting effort. A software-only
implementation of the OpenGL product was used, which
enabled the beta program to take place even before hardware
was available.

As hardware became available, the beta program was superseded
by the early access program. This program included the
original beta participants and additional selected developers.
In both the beta and early access programs, HP found that the
homework done earlier by the fast-break team paid big dividends.
Most applications were ported to HP in just a few days
and, in some cases, just a few hours!

Although not completely defect-free, these early versions of
OpenGL were uniformly high-performance and high-quality
products. By accelerating the application porting effort, HP
was able to identify and resolve the few remaining issues
before the product was officially released.

The ongoing involvement of the fast-break team with the 
OpenGL product development teams helped HP do it right the 
first time by delivering a high-quality, high-performance imple- 
mentation of OpenGL and enabling rapid porting of key appli- 
cations to the HP product. 



engineering process we used to accelerate the time to
market for OpenGL.

Graphics Middleware

A fast graphics API is not always enough. Leading edge
CAD modelling problems far exceed the interactive capacity
of graphical super workstations. For example, try
spinning a complete CAD model of a Boeing 777 at 30
frames per second on any system.

What is needed is a new approach to solving the rendering
problem of very large models. The goal is to trade
off between frame rate, image quality, and system cost.
HP has introduced a toolkit for use by CAD ISVs to
assist them in solving this problem. The toolkit is called
DirectModel and is described on page 19.



HP-UX Release 10.20 and later and HP-UX 11.00 and later (in both 32- and
64-bit configurations) on all HP 9000 computers are Open Group UNIX 95
branded products.

UNIX is a registered trademark of The Open Group.

X/Open is a registered trademark and the X device is a trademark of X/Open
Company Limited in the UK and other countries.

Microsoft is a U.S. registered trademark of Microsoft Corporation.

Windows is a U.S. registered trademark of Microsoft Corporation.

Silicon Graphics and OpenGL are registered trademarks of Silicon Graphics
Inc. in the United States and other countries.

An Overview of the HP OpenGL Software
Architecture

Kevin T. Lefebvre

Robert J. Casey

Michael J. Phelps

Courtney D. Goeltzenleuchter

Donley B. Hoffman

OpenGL is a hardware-independent specification of a 3D graphics programming
interface. This specification has been implemented on many different vendors'
platforms with different CPU types and graphics hardware, ranging from
PC-based board solutions to high-performance workstations.



The OpenGL API defines an interface (to graphics hardware) that deals
entirely with rendering 3D primitives (for example, lines and polygons). The
HP implementation of the OpenGL standard does not provide a one-to-one
mapping between API functions and hardware capabilities. Thus, the software
component of the HP OpenGL product fills the gaps by mapping API functions
to OpenGL-capable systems.

Since OpenGL is an industry-standard graphics API, much of the differentiating
value HP delivers is in performance, quality, reliability, and time to market.
The central goal of the HP implementation is to ship more performance and
quality much sooner.

What Is OpenGL?

OpenGL differs from other graphics APIs, such as Starbase, PHIGS, and PEX
(PHIGS extension in X), in that it is vertex-based as opposed to primitive-based.
This means that OpenGL provides an interface for supplying a single
vertex, surface normal, color, or texture coordinate parameter in each call.
Several of the calls between an OpenGL glBegin and glEnd pair define
a primitive that is then rendered. Figure 1 shows a comparison of the
different API call formats used to render a rectangle. In PHIGS a single call
could render a primitive by referencing multiple vertices and their associated
data (such as normals and color) as parameters to the call. This difference in
procedure calls per primitive (one versus eight for a shaded triangle) posed
a performance challenge for our implementation.
Figure 1

Graphics API call comparison.

Starbase

    polygon3d(...);

OpenGL

    glBegin(GL_QUADS);
    glNormal(...);
    glVertex(...);
    glNormal(...);
    glVertex(...);
    glNormal(...);
    glVertex(...);
    glNormal(...);
    glVertex(...);
    glEnd();

PEXlib

    PEXFillAreaSetWithData(...);
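The difference in call counts can be made concrete with a small mock. The sketch below is not OpenGL or PEXlib code: `im_begin`, `im_normal`, `im_vertex`, `im_end`, and `draw_polygon_with_data` are hypothetical stand-ins that do nothing except count procedure calls, so that the vertex-based and primitive-based styles can be compared for the same polygon.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;

/* Hypothetical call counter: every mock entry point bumps it once. */
static int api_calls = 0;

/* Vertex-based style: mocks of the glBegin/glNormal/glVertex/glEnd pattern. */
static void im_begin(void)           { api_calls++; }
static void im_normal(const Vec3 *n) { (void)n; api_calls++; }
static void im_vertex(const Vec3 *v) { (void)v; api_calls++; }
static void im_end(void)             { api_calls++; }

/* Primitive-based style: one call carries every vertex and its data. */
static void draw_polygon_with_data(const Vec3 *verts, const Vec3 *normals, size_t n)
{
    (void)verts; (void)normals; (void)n;
    api_calls++;
}

/* Number of calls needed to submit one shaded n-vertex polygon, vertex-based. */
int calls_vertex_based(const Vec3 *verts, const Vec3 *normals, size_t n)
{
    assert(verts != NULL && normals != NULL);
    api_calls = 0;
    im_begin();
    for (size_t i = 0; i < n; i++) {
        im_normal(&normals[i]);
        im_vertex(&verts[i]);
    }
    im_end();
    return api_calls;
}

/* Number of calls needed for the same polygon, primitive-based. */
int calls_primitive_based(const Vec3 *verts, const Vec3 *normals, size_t n)
{
    assert(verts != NULL && normals != NULL);
    api_calls = 0;
    draw_polygon_with_data(verts, normals, n);
    return api_calls;
}
```

For a triangle with one normal per vertex, the vertex-based mock makes 2 + 2 x 3 = 8 calls against a single call for the primitive-based mock, which is the "one versus eight" ratio quoted in the text.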



An OpenGL implementation consists of the following
elements:

■ A rendering library (GL) that implements the OpenGL
specification (the rendering pipeline)

■ A utility library (GLU) that implements useful utility
functions that are layered on top of OpenGL (for
example, surfaces, quadrics, and tessellation functions)

■ An interface to the system's windowing package, including
GLX for X Window Systems on the UNIX operating
system and WGL for Microsoft Windows.

Implementation Goals

The goals we defined for the OpenGL program that helped
to shape our implementation were to:

■ Achieve and sustain long-term price/performance leadership
for OpenGL applications running on HP platforms

■ Develop a scalable architecture that supports OpenGL
on a wide range of HP platforms and graphics devices.

The rest of this article will provide more details about
our OpenGL implementation and show how these goals
affected our system design.

OpenGL API 

In general, OpenGL defines a traditional 3D pipeline for
rendering 3D primitives. This pipeline takes 3D coordinates
as input, transforms them based on orientation or
viewpoint, lights the resulting coordinates, and then renders
them to the frame buffer (Figure 2).

To implement and control this pipeline, the OpenGL API
provides two classes of entry points. The first class is
used to create 3D geometry as a combination of simple
primitives such as lines, triangles, and quadrilaterals.
The entry points that make up this class are referred to
as the vertex API, or VAPI, functions. The second class,
called the state class, manipulates the OpenGL state used
in the different rendering pipeline stages to define how to
operate (transform, clip, and so on) on the primitive data.

VAPI Class

OpenGL contains a series of entry points that when used
together provide a powerful way to build primitives. This
flexible interface allows an application to provide primitive
data directly from its private data structures rather
than requiring it to define structures in terms of what the
API requires, which may not be the format the application
requires.

Primitives are created from a sequence of vertices. These
vertices can have associated data such as color, surface
normal, and texture coordinates. These vertices can be
grouped together and assigned a type, which defines how
the vertices are connected and how to render the resulting
primitive.

The VAPI functions available to define a primitive include
glVertex (specify its coordinate), glNormal (define a surface
normal at the coordinate), glColor (assign a color to the
coordinate), and several others. Each function has several
forms that indicate the data type of the parameter (for
example, int, short, and float), whether the data is passed
as a parameter or as a pointer to the data, and whether
the data is one-, two-, three-, or four-dimensional. Altogether
there are over 100 VAPI entry points that allow for
maximum application flexibility in defining primitives.
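The way the forms multiply can be sketched with a small helper that assembles an entry point name from its parts, in the style of real OpenGL names such as glVertex3f and glVertex3fv. The helper names `vapi_name` and `vapi_form_count` are hypothetical, and the helper does not check whether a given combination actually exists in OpenGL (glNormal, for example, is defined only for three components).

```c
#include <assert.h>
#include <stdio.h>

/* Builds a VAPI-style entry point name such as "glVertex3fv".
   base:       command name without the "gl" prefix, e.g. "Vertex"
   dims:       number of components, 1 to 4
   type:       one-letter type code, e.g. 's' short, 'i' int, 'f' float, 'd' double
   by_pointer: nonzero if the data is passed as a pointer ("v" suffix)
   Returns the length of the name, or -1 on bad input or a too-small buffer. */
int vapi_name(char *out, size_t cap, const char *base, int dims, char type, int by_pointer)
{
    assert(out != NULL && base != NULL);
    if (dims < 1 || dims > 4)
        return -1;
    int n = snprintf(out, cap, "gl%s%d%c%s", base, dims, type, by_pointer ? "v" : "");
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}

/* Counts the forms of one command: each allowed dimension and type
   comes in a by-value and a by-pointer variant. */
int vapi_form_count(int num_dims, int num_types)
{
    return num_dims * num_types * 2;
}
```

With three dimensions (2, 3, 4) and four types (short, int, float, double), a single command such as glVertex already accounts for 24 entry points, which suggests how the total passes 100 once the other commands are included.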

The VAPI functions glBegin and glEnd are used to create
groups of these vertices (and associated data). glBegin
takes a type parameter that defines the primitive type and
a count of vertices. The type can be point, line, triangle,
triangle strip, quadrilateral, or polygon. Based on the type
and count, the vertices are assembled together as primitives
and sent down the rendering pipeline.

For added efficiency and to reduce the number of procedure
calls required to render a primitive, vertex arrays
were added to revision 1.1 of the OpenGL specification.
Vertex arrays allow an application to define a set of vertices
and associated data before their use. After the vertex
data is defined, one or more rendering calls can be issued
that reference this data without the additional calls of
glBegin, glEnd, or any of the other VAPI calls.
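A minimal sketch of the idea follows. It is loosely modeled on the OpenGL 1.1 pattern of registering array pointers and then issuing one draw call, but the `VertexArrays` structure, the `draw_arrays` function, and the sink callback are hypothetical names chosen for illustration, and a single shared stride is assumed for simplicity.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical vertex-array state: pointers into application-owned data. */
typedef struct {
    const float *positions;    /* 3 floats per vertex */
    const float *normals;      /* 3 floats per vertex */
    size_t stride_floats;      /* floats between vertices; 0 means tightly packed (3) */
} VertexArrays;

/* Receiver for each vertex; stands in for the rendering pipeline. */
typedef void (*VertexSink)(const float *pos, const float *nrm, void *user);

/* Feeds `count` vertices starting at index `first` to the sink.
   The application makes one call no matter how many vertices are drawn. */
void draw_arrays(const VertexArrays *va, size_t first, size_t count,
                 VertexSink sink, void *user)
{
    assert(va && va->positions && va->normals && sink);
    size_t stride = va->stride_floats ? va->stride_floats : 3;
    for (size_t i = first; i < first + count; i++)
        sink(va->positions + i * stride, va->normals + i * stride, user);
}

/* Example sink: accumulates the x coordinates it receives. */
void sum_x_sink(const float *pos, const float *nrm, void *user)
{
    (void)nrm;
    *(float *)user += pos[0];
}
```

The per-vertex work still happens, but it moves inside the library, so the application pays for one procedure call rather than two per vertex plus the begin/end pair.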

Finally, OpenGL provides several rendering routines that
do not deal with 3D primitives, but rather with rectangular
areas of pixels. From OpenGL, an application can read,
copy, or draw pixels to or from any of the OpenGL
image, depth, or texture buffers.

State Class 

The state class of API functions manipulates the OpenGL
state machine. The state machine defines how vertices
are operated on as they pass through the rendering pipeline.
There are over 100 functions in this class, each controlling
a different aspect of the pipeline. In OpenGL most
state information is orthogonal to the type of primitive
being operated on. For example, there is a single primitive
color rather than a specific line color, polygon color, or
point color. These state manipulation routines can be
grouped as:

■ Coordinate transformation

■ Coloring and lighting

■ Clipping

■ Rasterization

■ Texture mapping

■ Fog

■ Modes and execution.

Pipeline

Coordinate data (such as vertex, color, and surface normal)
can come directly from the application, indirectly
from the application through the use of evaluators,* or
from a stored display list that the application had previously
created. The coordinates flow into the pipeline as
discrete points and are operated on (transformed) individually.
At a certain point in the pipeline the vertices are
assembled into primitives, and they are operated on at the
primitive level (for example, clipping). Next, the primitives
are rasterized into fragments in which operations
like depth testing occur on each fragment. The final result
is pixels that are written into the frame buffer. This more
complex OpenGL pipeline is shown in Figure 3.

* Evaluators are functions that derive coordinate information based on parametric curves
or surfaces and basis functions.

Conceptually, the transform stage takes application-specified
object-space coordinates and transforms them
to eye-space coordinates (the space that positions the
object with respect to the viewer) with a model-view
matrix. Next, the eye coordinates are projected with a
projection matrix, divided by the perspective, and then
transformed by the viewport matrix to get them to screen
space (relative to a window). This process is summarized
in Figure 4.
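The sequence of transformations can be sketched in a few lines of C. This is an illustration under stated assumptions rather than the HP implementation: the types `Mat4`, `Vec4`, and `Viewport` and the function `transform_vertex` are hypothetical, matrices are stored row-major, and the viewport step is written as the usual scale-and-offset from normalized coordinates in [-1, 1] instead of as an explicit matrix.

```c
#include <assert.h>

typedef struct { double m[4][4]; } Mat4;                   /* row-major: m[row][col] */
typedef struct { double x, y, z, w; } Vec4;
typedef struct { double x, y, width, height; } Viewport;   /* window rectangle in pixels */

static Vec4 mat4_mul_vec4(const Mat4 *a, Vec4 v)
{
    Vec4 r;
    r.x = a->m[0][0] * v.x + a->m[0][1] * v.y + a->m[0][2] * v.z + a->m[0][3] * v.w;
    r.y = a->m[1][0] * v.x + a->m[1][1] * v.y + a->m[1][2] * v.z + a->m[1][3] * v.w;
    r.z = a->m[2][0] * v.x + a->m[2][1] * v.y + a->m[2][2] * v.z + a->m[2][3] * v.w;
    r.w = a->m[3][0] * v.x + a->m[3][1] * v.y + a->m[3][2] * v.z + a->m[3][3] * v.w;
    return r;
}

/* object space -> eye space -> clip space -> normalized device -> window space */
Vec4 transform_vertex(const Mat4 *model_view, const Mat4 *projection,
                      const Viewport *vp, Vec4 object)
{
    assert(model_view && projection && vp);
    Vec4 eye  = mat4_mul_vec4(model_view, object);
    Vec4 clip = mat4_mul_vec4(projection, eye);
    assert(clip.w != 0.0);   /* a real pipeline clips before dividing */

    /* Perspective divide. */
    Vec4 ndc = { clip.x / clip.w, clip.y / clip.w, clip.z / clip.w, 1.0 };

    /* Viewport mapping: [-1, 1] to the window rectangle, depth to [0, 1]. */
    Vec4 win;
    win.x = vp->x + (ndc.x + 1.0) * 0.5 * vp->width;
    win.y = vp->y + (ndc.y + 1.0) * 0.5 * vp->height;
    win.z = (ndc.z + 1.0) * 0.5;
    win.w = 1.0;
    return win;
}
```

With identity matrices and a 640 by 480 viewport at the origin, the object-space origin lands at the window center, (320, 240), with a depth of 0.5.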

In the lighting stage, a color is computed for each vertex
based on the lighting state. The lighting state consists of
a number of lights, the type of each light (such as positional
or spotlight), various parameters of each light (for
example, position, pointing direction, or color), and the
material properties of the object being lit. The calculation
takes into consideration, among other things, the light
state and the distance of the coordinate to each light,
resulting in a single color for the vertex.

In rasterization, pixels are written based on the primitive
type, and the pixel value to be written is based on various
rasterization states (such as texture mapping enabled, or
polygon stipple enabled). OpenGL refers to the resulting
pixel value as a fragment because in addition to the pixel
value, there is also coverage, depth, and other state information
associated with the fragment. The depth value is
used to determine the visibility of the pixel as it interacts
with existing objects in the frame buffer, while the coverage,
or alpha, value blends the pixel value with the existing
value in the frame buffer.
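The two per-fragment operations mentioned here can be sketched as follows. The code assumes one common configuration, a "less than" depth comparison followed by blending weighted by the fragment's alpha; OpenGL allows other depth functions and blend factors. The `Fragment` and `Pixel` types and the `apply_fragment` function are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float r, g, b; } Color;

typedef struct {
    Color color;
    float alpha;   /* coverage in [0, 1] */
    float depth;   /* window-space depth in [0, 1]; smaller is closer */
} Fragment;

typedef struct {
    Color color;
    float depth;
} Pixel;

/* Applies a "less than" depth test, then blends the fragment over the
   stored pixel using the fragment's alpha. Returns true if the fragment
   passed the depth test and the pixel was updated. */
bool apply_fragment(Pixel *dst, const Fragment *frag)
{
    assert(dst && frag);
    if (!(frag->depth < dst->depth))
        return false;                 /* hidden behind what is already stored */

    float a = frag->alpha;
    dst->color.r = a * frag->color.r + (1.0f - a) * dst->color.r;
    dst->color.g = a * frag->color.g + (1.0f - a) * dst->color.g;
    dst->color.b = a * frag->color.b + (1.0f - a) * dst->color.b;
    dst->depth = frag->depth;
    return true;
}
```

A fragment that fails the depth test leaves both the stored color and the stored depth untouched, which is why the depth comparison is done before any blending arithmetic.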

Software Architecture 

One of the main design goals for the HP OpenGL software
architecture was to maximize performance where it
would be most effective. For example, we decided to
focus on reducing overhead to hardware-accelerated
paths and to base design decisions on application use,
minimizing the effort and cost required to support future
system hardware. The resulting architecture is composed
of two major components: a device-independent module
and a device-specific module. A simple block diagram is
shown in Figure 5.

The dispatch component is responsible for handling
OpenGL API calls and sending them to the appropriate
receiver. OpenGL can be in one of the following modes:

■ Protocol mode in which API calls are packaged up and
forwarded to a remote system for execution

■ Display list creation mode in which API calls are stored
in a display list for later execution

■ Direct rendering mode in which API calls are intended
for immediate rendering on the local screen.
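One way to route calls to a mode-specific receiver without testing the mode on every call is a table of function pointers that is swapped when the mode changes. The sketch below illustrates that general technique and is not taken from the HP source; all names (`DispatchTable`, `set_dispatch_mode`, `api_vertex3f`, and the three receivers) are hypothetical, and only a single vertex call is modeled.

```c
#include <assert.h>

typedef enum { MODE_PROTOCOL, MODE_DISPLAY_LIST, MODE_DIRECT, MODE_COUNT } DispatchMode;

/* One slot per API call; only a single vertex call is modeled here. */
typedef struct {
    void (*vertex3f)(float x, float y, float z);
} DispatchTable;

/* Counters stand in for the real receivers. */
static int protocol_packets, display_list_entries, direct_draws;

static void protocol_vertex3f(float x, float y, float z)
{ (void)x; (void)y; (void)z; protocol_packets++; }       /* would encode and send */

static void display_list_vertex3f(float x, float y, float z)
{ (void)x; (void)y; (void)z; display_list_entries++; }   /* would append to a list */

static void direct_vertex3f(float x, float y, float z)
{ (void)x; (void)y; (void)z; direct_draws++; }           /* would drive the hardware */

static const DispatchTable tables[MODE_COUNT] = {
    [MODE_PROTOCOL]     = { protocol_vertex3f },
    [MODE_DISPLAY_LIST] = { display_list_vertex3f },
    [MODE_DIRECT]       = { direct_vertex3f },
};

static const DispatchTable *current = &tables[MODE_DIRECT];

/* Mode changes are rare, so the cost of switching tables is paid here. */
void set_dispatch_mode(DispatchMode mode)
{
    assert((int)mode >= 0 && mode < MODE_COUNT);
    current = &tables[mode];
}

/* Public entry point: one indirect call, no per-call mode test. */
void api_vertex3f(float x, float y, float z)
{
    current->vertex3f(x, y, z);
}
```

The design choice is to keep the frequently executed path (the vertex call) as short as possible and move the decision making to the infrequent mode switch.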



Figure 5

The primary application path of any importance is the
immediate rendering path. While in direct rendering mode
the performance of all functions is important, but the performance
of the VAPI calls is even more critical because
of the increased frequency of rendering calls over other
types of calls, like state setting. Any overhead in transferring
application rendering commands to the hardware
reduces overall performance significantly. See the "System
Design Results" section in this article on page 14 for a
discussion on some of these issues.

The device-independent module is the target for all the
OpenGL state manipulation calls, and in some situations,
for VAPI calls such as display list or protocol generation.
This module contains state management, all system control
logic, and a complete software implementation of
the OpenGL rendering pipeline up to the rasterization
stage, which is used in situations where the hardware
does not support an OpenGL feature. The device-independent
module is made up of several submodules,
including:

■ GLX (OpenGL GLX support module) for handling window
system dependent components, including context
management, X Window System interactions, and protocol
generation

■ SUM (system utilities module) for handling system
dependent components, including system interactions,
global state management, and memory management

■ OCM (OpenGL control module) for handling OpenGL
state management, parameter checking, state inquiry
support, and notification of state changes to the appropriate
module

■ PCM (pipeline control module) for handling graphics
pipeline control, state validation, and the software
rendering pipeline

■ DLM (display list module) for handling display list
creation and execution.

The device-specific module is basically an abstracted
hardware interface that resides in a separate shared library.
Based on what hardware is available, the device-independent
code dynamically loads the appropriate device-specific module.
In general the device-specific module is called only by the
device-independent module, never by the API, and converts the
requests to hardware-specific operations (register loads,
operation execute). In addition to a device-specific module
for the VISUALIZE fx series of graphics hardware, there is a
virtual memory driver device-specific module for handling
OpenGL operations on GLX pixmaps (virtual-memory-based image
buffers) or for rendering to hardware that does not support
OpenGL semantics.

The final key component of the architecture is streamlines.
Streamlines are part of the device-specific module but are
unique in that they are associated directly with the API. On
geometry-accelerated devices like the VISUALIZE fx series,
the hardware can support the full set of VAPI calls. To
minimize overhead and maximize performance, the calls are
targeted to optimized routines that communicate directly with
the hardware. In many cases these routines are coded in
PA-RISC 1.1 or PA-RISC 2.0 assembly language or C. At
initialization time the appropriate routines are loaded in
the dispatch table based on the system type and are
dynamically selected at run time.

An important thing to understand about streamlines is that
they can only be called when the current state is "clean" and
the hardware supports the current rendering mode. An example
of "not clean" is when the viewing matrix has been changed,
and the hardware needs to be updated with the current
transformation matrix. Because the application can make
several different calls to manipulate the matrix, computing
the state based on the viewing matrix and loading the hardware
is deferred until it is actually needed. For example, when a
primitive is to be rendered (initiated via a glBegin call),
the state is made clean (validated) by the device-independent
code and subsequent VAPI calls can be dispatched directly to
the streamlines. Another situation in which streamlines cannot
be called is when the hardware does not support a feature,
such as texture mapping on VISUALIZE fx display hardware that
lacks it. In this situation the VAPI entry points do not
target the streamlines but rather the device-independent code
that implements what is called a general path, or in other
terms, a software rendering pipeline.

Three-Process Model

Under the X Window System on the UNIX operating system, the
OpenGL architecture uses a three-process model to support the
direct and indirect semantics of OpenGL. In our
implementation, we have leveraged our existing direct hardware
access (DHA) technology to provide industry-leading local
rendering performance. This has been coupled with two distinct
remote rendering modes, making our OpenGL implementation one
of the most flexible implementations in the industry. These
rendering modes are based upon the three-process rendering
model shown in Figure 6. This model supports three rendering
modes: direct, indirect, and virtual.

Direct Rendering. Direct rendering through DHA provides the
highest level of OpenGL performance and is used whenever an
OpenGL application is connected to a local X server running
on a workstation with VISUALIZE fx graphics hardware. For all
but a few operations, the application process communicates
directly with the graphics hardware, bypassing the
interprocess communication overhead between the application
and the X server.

Indirect Rendering (Protocol). Indirect rendering is used
primarily for remote operation when the target X server is
running on a different workstation than the user application.
In this mode, the OpenGL API library emits GLX protocol, which
is interpreted by a receiving X server that supports the GLX
extension. The receiving server can be HP, Sun Microsystems,
Silicon Graphics® International, or any other X server that
supports the GLX server extension. In the HP OpenGL
implementation, the receiving X server passes nearly all GLX
protocol directly on to an OpenGL daemon process that uses DHA
for maximum performance. Note that immediate mode rendering
performance through protocol can be severely limited by the
time it takes to send geometric data over the network.
However, when display lists are used, geometric data is cached
in the OpenGL daemon and remote OpenGL rendering can be as
fast or sometimes even faster than local DHA rendering.

Virtual Rendering. As a value-added feature, HP OpenGL also
provides a virtual GL rendering mode not available in other
OpenGL implementations. Virtual rendering allows an OpenGL
application to be displayed on any X server or X terminal even
if the GLX extension is not supported on that server. This is
accomplished by rendering through the virtual memory driver to
local memory and then issuing the standard XPutImage protocol
to display images on the target screen. Although flexible,
virtual GL is typically the slowest of the OpenGL rendering
modes. However, virtual GL rendering performance can be
increased significantly by limiting the size of the output
window.

System Design Results 

To deliver industry-leading OpenGL performance, we combined
graphics hardware, libraries, and drivers. The hardware is the
core enabler of performance. Although the excellence of each
part is important, the overall system design is even more so.
How well the operating system, compilers, libraries, drivers,
and hardware fit together in the system design determines the
overall result. We worked closely with teams in four HP R&D
labs to optimize the system design, apply our design values to
partitioning the system, balance performance bottlenecks, and
simplify the overall architecture and interfaces. The
following section describes some examples of applying our
system design principles to the most important aspects of 3D
graphics applications.

Improving OpenGL Application Performance

OpenGL required a radical change from the existing (legacy)
HP graphics APIs. In analyzing the model for our legacy
graphics APIs, we realized that the same model would have
considerable overhead for OpenGL, which requires many more
procedure calls. Figure 1 compares the calls required to
generate the same shaded quadrilateral.

To have a competitive OpenGL, we needed to reduce or
eliminate function calls and locking overhead. We did this
with two system design initiatives called fast procedure
calls and implicit device locking.

Fast Procedure Calls. Two of our laboratories (the Graphics
Systems Laboratory and the Cupertino Language Laboratory)
worked together to create a specification for a new, faster
calling convention for making calls to shared library
components. This reduced the cost to one-fourth the cost of
the previous mechanism.

OpenGL is a state machine. When the application calls an
OpenGL function, different things happen depending on the
current state. We also wanted to support different devices
with varying degrees of support in the same OpenGL library.
We needed a dynamic method of dispatching API function calls
to the correct code to enable the appropriate functionality
without compromising performance. Given this requirement, a
naive implementation of OpenGL might define each of its API
functions like the following:

    void glVertex3fv(const GLfloat *v)
    {
        switch (context.whichFunction)
        {
        case HW_STREAMLINE:
            HW_STREAMLINE_glVertex3fv(v);
            break;
        case GENERAL_PATH:
            GENERAL_PATH_glVertex3fv(v);
            break;
        case GLX_PROTOCOL:
            GLX_PROTOCOL_glVertex3fv(v);
            break;
        case DISPLAY_LIST:
            DISPLAY_LIST_glVertex3fv(v);
            break;
        }
    }



However, this is a very impractical implementation in terms
of both performance and software maintainability. We decided
that the most efficient method of achieving this kind of
dynamic dispatching was to retarget the API function calls at
their source: the application code. Any call into a shared
library is really a call through a pointer. The procedure name
that the application calls is associated with a particular
pointer. Conceptually, what we needed was a mechanism to
manage the contents of those pointers. To accomplish this, we
needed more assistance from the engineers in the compiler and
linker groups.

In simplified terms, the OpenGL library maintains a procedure
link table. Each entry in the procedure link table is
associated with a particular function name and is composed of
two pointers. One points to the code that is to be called, and
the other, the link table pointer, points to the table used by
shared library code (known as PIC, or position-independent
code) to locate global data. When the compiler generates a
call to an OpenGL function, it loads the appropriate registers
with the two fields in the associated procedure link table
entry and then branches to the function. Since OpenGL controls
the contents of the procedure link table, it can change the
contents of these fields during execution. This allows OpenGL
to choose the appropriate code based on the OpenGL state
dynamically.

For example, assume that we have a graphics device that,
except for texture mapping, supports the OpenGL pipeline in
hardware. In this case, when texture mapping is enabled, the
scheduling code detects that the device cannot handle it and
chooses the GENERAL_PATH_glVertex3fv code path, which performs
software texture mapping. The HW_STREAMLINE_glVertex3fv code
paths are taken if texture mapping is not enabled.
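
The retargeting idea can be sketched in portable C with an ordinary function pointer. This is only an illustration under stated assumptions: in the mechanism described above the pointer lives in the procedure link table, so the application's call lands directly on the chosen routine, whereas here a plain variable (vertex3fv_slot) and a wrapper (sketch_glVertex3fv) stand in for it. The names retarget_vertex_path, hw_call_count, and general_call_count are hypothetical helpers invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef float GLfloat;                      /* stand-in for the OpenGL type */
typedef void (*vertex3fv_fn)(const GLfloat *v);

/* Counters so the example can show which path handled each call. */
static int hw_calls = 0;
static int general_calls = 0;

static void HW_STREAMLINE_glVertex3fv(const GLfloat *v) { (void)v; hw_calls++; }
static void GENERAL_PATH_glVertex3fv(const GLfloat *v)  { (void)v; general_calls++; }

/* Hypothetical stand-in for one procedure link table entry. */
static vertex3fv_fn vertex3fv_slot = HW_STREAMLINE_glVertex3fv;

/* Runs when state changes, not on every vertex call. */
void retarget_vertex_path(bool texture_enabled, bool hw_supports_texture)
{
    if (texture_enabled && !hw_supports_texture)
        vertex3fv_slot = GENERAL_PATH_glVertex3fv;   /* software pipeline */
    else
        vertex3fv_slot = HW_STREAMLINE_glVertex3fv;  /* direct to hardware */
}

/* Per-call cost is one indirect call; there is no switch statement. */
void sketch_glVertex3fv(const GLfloat *v)
{
    assert(v != NULL);
    vertex3fv_slot(v);
}

int hw_call_count(void)      { return hw_calls; }
int general_call_count(void) { return general_calls; }
```

The design point is that the decision is made once, when the state changes, rather than being re-evaluated inside every vertex call as in the naive switch-based version.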

Implicit Device Locking. Graphics devices are a shared system
resource. As such, there must be some control when an
application has access to the graphics device so that two
applications are not attempting to use the device at the same
time. Normally the operating system manages such shared
resources via standard operating system interfaces (open,
close, read, write, and ioctl).

However, to get the maximum performance possible for graphics
applications, a user process will access the graphics device
directly through our 3D API libraries, rather than use the
standard operating system interfaces. This means that before
OpenGL, the HP graphics libraries had to assume the task of
managing shared access to the graphics device.

Before OpenGL, we used a relatively lightweight fast lock at
the entry and exit of those library routines that actually
accessed the device. With the high frequency of function calls
in OpenGL, performing this lock and unlock step for each
function call would exact a severe performance penalty,
similar to the procedure call problem discussed earlier.

To solve this problem, HP engineers invented a technique
called implicit device locking. When a process tries to
access the graphics hardware and does not own the device, a
virtual memory protection fault exception will be generated.
The kernel must detect that this protection fault was an
attempted graphics device access instead of a fault from
trying to access something like an invalid address, a
swapped-out page, or a copy-on-write page.

The graphics fault alerts the system that there is another
process trying to access the graphics device. The kernel then
makes sure that the graphics device context is saved, and the
graphics context for the next process is restored. After the
graphics context switch is complete, the new process is
allowed to continue with access to the device, and permission
is taken away from all other processes. This allows the
current process that owns the device to have zero overhead
access.

This method removes the requirement that the 3D graphics API
library must explicitly lock the graphics device while
accessing it. This means that the overhead associated with
device locking, which was an order of magnitude more than with
Starbase, is completely eliminated (see Figure 7).
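
The ownership handoff described above can be modeled as a small user-space simulation. This sketch does not use real memory protection, signals, or any HP-UX kernel interface; the gfx_kernel structure and the functions gfx_init, gfx_fault, and gfx_write are hypothetical names chosen for illustration, and a single integer stands in for the device context.

```c
#include <assert.h>

#define MAX_PROCS 4
#define NO_OWNER  (-1)

/* Simulated kernel bookkeeping; every name here is hypothetical. */
typedef struct {
    int owner;                     /* process allowed to touch the device */
    int faults;                    /* protection faults taken so far */
    int hw_context;                /* stand-in for live device state */
    int saved_context[MAX_PROCS];  /* per-process saved device state */
} gfx_kernel;

void gfx_init(gfx_kernel *k)
{
    k->owner = NO_OWNER;
    k->faults = 0;
    k->hw_context = 0;
    for (int i = 0; i < MAX_PROCS; i++)
        k->saved_context[i] = 0;
}

/* Fault path: save the old context, restore the new one, move ownership. */
static void gfx_fault(gfx_kernel *k, int pid)
{
    k->faults++;
    if (k->owner != NO_OWNER)
        k->saved_context[k->owner] = k->hw_context;
    k->hw_context = k->saved_context[pid];
    k->owner = pid;
}

/* A device write by process pid. Only a non-owner pays the fault cost. */
void gfx_write(gfx_kernel *k, int pid, int value)
{
    assert(pid >= 0 && pid < MAX_PROCS);
    if (k->owner != pid)
        gfx_fault(k, pid);
    k->hw_context = value;         /* owner path: no lock, no unlock */
}
```

In this model a process that already owns the device performs any number of writes without entering the fault path, which corresponds to the zero-overhead case in the text; the cost appears only when ownership changes hands.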

This dramatic improvement in performance is made possible by
improvements in the HP-UX kernel and careful design of the
graphics hardware. The basic idea is that when multiple
graphics applications are running, the HP-UX kernel will
ensure that each application gets its fair share of exclusive
time to access the graphics device.

OpenGL was not the only API to benefit from implicit locking.
The generality of the design allowed us to use the same
mechanism to eliminate the locking code from Starbase as well.
Keeping the whole system in mind while developing this
technology allowed us to expand the benefit beyond the
original problem: excessive overhead from locking for OpenGL.



Figure 7 
Hardware and Software Tradeoffs

Keeping the whole picture in mind allowed us to make software
and hardware trade-offs to simplify the system design. The
criteria were based on performance criticality, frequency of
use, system complexity, and factory cost.

For example, the hardware was designed to understand both
OpenGL and Starbase windows. OpenGL requires the window origin
to be in the lower left corner, whereas Starbase requires it
to be in the upper left. Putting the intelligence in the
hardware reduced the overall system complexity.

Nearly all OpenGL features are hardware accelerated. Of
course, all vertex API formats and dimensions are streamlined
and accelerated in hardware for maximum primitive performance.
Similarly, all fragment pipeline operations had to be
supported in hardware because fragment operations touch every
pixel and software performance would not be sufficient. To
maximize primitive performance, we also hardware-accelerated
nearly every geometry pipeline feature. For example, all
lighting modes, fog modes, and arbitrary clip planes are
hardware-accelerated. Very few OpenGL features are not
hardware-accelerated.

Based on infrequent use and the ability to reasonably
accelerate in software, we implemented the following functions
in software: RasterPos, Selection, Feedback, Indexed Lighting,
and Indexed Fog. Infrequent use and factory cost also
encouraged us to implement accumulation buffer support in
software. (Accumulation is an operation that blends data
between the frame buffer and the accumulation buffer, allowing
effects like motion blur.)

State Change 

Through systems design we achieved dramatic results in
application performance by focusing on the design for OpenGL
state change operations.

Application graphics performance is a function of both
primitive and state change (attributes) performance. We have
designed our OpenGL implementation to maximize primitive
performance and minimize the costs of state changes.

State changes include all the function calls that modify the
OpenGL modal state, including coordinate transformations,
lighting state, clipping state, rasterization state, and
texture state. State change does not include primitive calls,
pixel operations, display list calls, or current state calls.
Current state encompasses all the OpenGL calls that can occur
either inside or outside glBegin() and glEnd() pairs (for
example, glColor(), glNormal(), glVertex()).

There are two classes of state changes: fragment pipeline and
geometry pipeline. Fragment pipeline state changes control the
back end, or rasterization stage, of the graphics pipeline.
This state includes the depth test enable (z-buffer hidden
surface removal) and the line stipple definition (patterned
lines such as dash or dot). Geometry pipeline state changes
control the front end of the graphics pipeline. This state
includes transformation matrices, lighting parameters, and
front and back culling parameters. Fragment pipeline state
changes are generally less costly than geometry pipeline state
changes.

Our systems design focused on several areas that resulted in
large application performance gains. We realized that the
performance of our state change implementation could
significantly affect application performance. We decided that
this was important enough to require a redesign of the state
change modules and not just tuning. Applying these
considerations led us to implement immediate and deferred
validation schemes and provide redundancy checks at the
beginning of each state change entry point.

Validation. We implemented different immediate and deferred
validation schemes* for different classes of state changes.
Geometry pipeline state changes are handled by deferred
validation because they tend to be more complex, requiring
massaging of the state. They are also more interlocked because
changing one piece of state requires modifying another piece
of state (for example, matrix changes cause changes to the
light state). For us, deferred validation resulted in a simple
design and increased performance, reliability, and
maintainability. For fragment pipeline state changes, we chose
immediate validation because this state is relatively simple
and noninterlocked.
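
The difference between the two schemes can be illustrated with a minimal C sketch. It is not the HP implementation: the gl_context structure and its fields are hypothetical, a single scale factor stands in for the model matrix, and its reciprocal stands in for the derived lighting state that depends on that matrix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context; field names are illustrative only. */
typedef struct {
    float model_scale;     /* stand-in for the model matrix */
    float light_scale;     /* derived state that depends on the matrix */
    bool  geometry_dirty;  /* set by geometry pipeline state changes */
    bool  depth_test;      /* simple fragment pipeline state */
    int   validations;     /* times the derived state was recomputed */
    int   hw_loads;        /* times state was loaded into the hardware */
} gl_context;

/* Geometry state change: record the value and mark the state dirty. */
void set_model_scale(gl_context *c, float s)
{
    c->model_scale = s;
    c->geometry_dirty = true;      /* deferred: no derived work yet */
}

/* Fragment state change: simple, so it is loaded immediately. */
void set_depth_test(gl_context *c, bool on)
{
    c->depth_test = on;
    c->hw_loads++;
}

/* Start of a primitive (the glBegin point): make the state clean. */
void begin_primitive(gl_context *c)
{
    if (c->geometry_dirty) {
        assert(c->model_scale != 0.0f);          /* legality check */
        c->light_scale = 1.0f / c->model_scale;  /* derived information */
        c->validations++;
        c->hw_loads++;
        c->geometry_dirty = false;
    }
}
```

With this structure, several consecutive matrix changes cost one validation at the next primitive rather than one validation per change.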

Redundancy Checks. Redundancy checks are done for all OpenGL
API calls. Because our analysis showed that applications often
call state changing routines with a redundant state (for
example, new value == current value), we wanted a design in
which this case performs well. Therefore, our design includes
redundancy checks at the beginning of each state change entry
point, which allows a quick return without exercising the
unnecessary validation code.

* Validation is the mechanism that verifies that the current
specified state is legal, computes derived information from
the current state necessary for rendering (for example, an
inverse matrix for lighting based on the current model
matrix), and loads the hardware with the new state.
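
A redundancy check of this kind might look like the following C sketch. The state_ctx structure, the set_depth_test_enabled entry point, and the validation counter are hypothetical and exist only to make the early return visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state record for one piece of fragment state. */
typedef struct {
    bool depth_test_enabled;
    int  validations;    /* counts runs of the validation path */
} state_ctx;

/* Stand-in for the real validation work (legality, hardware load). */
static void validate_depth_test(state_ctx *c)
{
    c->validations++;
}

/* State change entry point with the redundancy check placed first. */
void set_depth_test_enabled(state_ctx *c, bool enable)
{
    assert(c != NULL);
    if (c->depth_test_enabled == enable)
        return;                    /* new value == current value */
    c->depth_test_enabled = enable;
    validate_depth_test(c);
}
```

Repeated calls with an unchanged value return after a single comparison, so an application that resets the same state every frame does not pay for validation each time.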

Results. For state-change intensive applications, these design
decisions put us in a leadership position for OpenGL
application performance, and we achieved greater than a 2x
performance gain over our previous graphics libraries. Smaller
application performance gains were achieved throughout our
OpenGL implementation with the state-change design.



Conclusion 



ISVs and customers indicate that we have met the application
leadership price and performance goals that we set at the
start of the program. We have also exceeded the performance
metrics we committed to at the beginning of the project. For
more information regarding our performance results, visit the
web site:

http://www.spec.org/gpc/opc

For long-term sustainability of our price and performance
leadership, we have continued working closely with our ISVs to
tune our implementation in areas that improve application
performance. In addition, new CPUs are planned that will allow
our implementation to run faster without any effort on our
part, and cost reductions are continuing in graphics hardware.

The goal to develop an implementation that can support a wide
range of CPU or graphics devices has already been
demonstrated. We support three graphics devices that have
different performance levels (all based on the same hardware
architecture) and a pure software implementation that supports
simple frame buffer devices on UNIX and Windows NT systems.
The DirectModel Toolkit: Meeting the 3D
Graphics Needs of Technical Applications



Brian E. Cripe 



Thomas A. Gaskins 



The increasing use of 3D modeling for highly complex
mechanical designs has led to a demand for systems that can
provide smooth interactivity with 3D models containing
millions or even billions of polygons.
DirectModel* is a toolkit for creating technical 3D graphics
applications. Its primary objective is to provide the
performance necessary for interactive rendering of large 3D
geometry models containing millions of polygons. DirectModel
is implemented on top of traditional 3D graphics applications
programming interfaces (APIs), such as Starbase or OpenGL. It
provides the application developer with high-level 3D model
management and advanced geometry culling and simplification
techniques. Figure 1 shows DirectModel's position within the
architecture of a 3D graphics application.

This article discusses the role of 3D modeling in design
engineering today, the challenges of implementing 3D modeling
in mechanical design automation (MDA) systems, and the 3D
modeling capabilities of the DirectModel toolkit.

Visualization in Technical Applications 

The Role of 3D Data

3D graphics is a diverse field that is enjoying rapid progress
on many fronts. Significant advances have been made recently
in photorealistic rendering, animation quality, low-cost game
platforms, and state-of-the-art immersive virtual reality†
applications. The Internet is populated with 3D virtual worlds
and software catalogs are full of applications for creating
them. An example of a 3D model is shown in Figure 2.

* DirectModel was jointly developed by Hewlett-Packard and
Engineering Animation Incorporated of Ames, Iowa.

Figure 1

Application architecture.

Figure 2

A low-resolution image of a 3D model of an engine
consisting of 150,000 polygons.

What do these developments mean for the users of technical
applications (the scientists and engineers who pioneered the
use of 3D graphics as a tool for solving complex problems)? In
many ways this technical community is following the same
trends as the developers and users of nontechnical
applications such as 3D games and interactive virtual worlds.
They are interested in finding less expensive systems for
doing their work, their image quality standards are rising,
and their patience with poor interactive performance is
wearing thin.

However, there are other areas where the unique aspects of 3D
data for technical applications create special requirements.
In many applications the images created from the 3D data that
are displayed to the user are the goal. For example, the
player of a game or the pilot in a flight simulator cares a
lot about the quality and interactivity of the images, but
cares very little about the data used by the system to create
those images. In contrast, many technical users of 3D graphics
consider their data to be the most important component. The
goal is to create, analyze, or improve the data, and 3D
rendering is a useful means to that end.

† Immersive virtual reality is a technology that "immerses"
the viewer into a virtual reality scene with head-mounted
displays that change what is viewed as the user's head rotates
and with gloves that sense where the user's hand is positioned
and apply force feedback.

This key distinction between data that is the goal itself and
data that is a means to an end leads to major differences in
the architectures and techniques for working with those data
sets.

3D Model Complexity

Understanding the very central role that data holds for the
technical 3D graphics user immediately leads to the questions
of what is that data and what are the significant trends over
time? The short answer is that the size of the data is big and
the amount and complexity of that data is increasing rapidly.
For example, a mechanical engineer doing stress analysis may
now be tackling problems modeled with millions of polygons
instead of the thousands that sufficed a few years ago.

The trends in the mechanical design automation (MDA) industry
are good examples of the factors causing this growth. In the
not-too-distant past mechanical design was accomplished using
paper and pencil to create part drawings, which were passed on
to the model shop to create prototype parts, and then they
were assembled into prototype products for testing. The first
step in computerizing this process was the advent of 2D
mechanical drafting applications that allowed the mechanical
engineers to replace their drafting boards with computers.
However, the task was still to produce a paper drawing to send
to the model shop. The next step was to replace these 2D
drafting applications with 3D solid modelers that could model
the complete 3D geometry of a part and support tasks such as
static and dynamic design analysis to find such things as the
stress points when the parts move. This move to 3D solid
modeling has had a big impact at many companies as a new
technique for designing parts. However, in many cases it has
not resulted in a fundamental change to the process for
designing and manufacturing whole products.

Advances. In the last few years advances in the mechanical
design automation industry have increasingly addressed virtual
prototyping and other whole-product design issues. This desire
to create new tools and processes that allow a design team to
design, assemble, operate, and analyze an entire product in
the computer is particularly strong at companies that
manufacture large and complex products such as airplanes,
automobiles, and large industrial plants. The leading-edge
companies pioneering these changes are finding that
computer-based virtual prototypes are much cheaper to create
and easier to modify than traditional physical prototypes. In
addition they support an unprecedented level of interaction
among multiple design teams, component suppliers, and end
users that are located at widely dispersed sites.

Fahrenheit

Hewlett-Packard, Microsoft, and Silicon Graphics are
collaborating on a project, code-named "Fahrenheit," that will
define the future of graphics technologies. Based on the
creation of a suite of APIs for DirectX on the Windows and
UNIX operating systems, the Fahrenheit project will lead to a
common, extensible architecture for capitalizing on the
rapidly expanding marketplace for graphics.

Fahrenheit will incorporate the Microsoft Direct3D and
DirectDraw APIs with complementary technologies from HP and
Silicon Graphics. HP is contributing DirectModel to this
effort and is working with Microsoft and Silicon Graphics to
define the best integration of the individual technologies.

This move to computerized whole-product design is in turn
leading to many new uses of the data. If the design engineers
can interact online with their entire product, then each
department involved in product development will want to be
involved. For example, the marketing department wants to look
at the evolving design while planning their marketing
campaign, the manufacturing department wants to use the data
to ensure the product's manufacturability, and the sales force
wants to start showing it to customers to get their feedback.

These tasks all drive an increased demand for realistic models
that are complete, detailed, and accurate. For example,
mechanical engineers are demanding new levels of realism and
interactivity to support tasks such as positioning the
fasteners that hold piping and detecting interferences created
when a redesigned part bumps into one of the fasteners. This
is a standard of realism that is very different from the
photorealistic rendering requirements of other applications
and, to the technical user, a higher priority.



Larger Models. These trends of more people using better tools
to create more complete and complex data sets combine to
produce very large 3D models. To understand this complexity,
imagine a complete 3D model of everything you see under the
hood of your car. A single part could require at least a
thousand polygons for a detailed representation, and a product
such as an automobile is assembled from thousands of parts.
Even a small product such as an HP DeskJet printer that sits
on the corner of a desk requires in excess of 300,000
triangles for a detailed model. A car door with its smooth
curves, collection of controls, electric motors, and wiring
harness can require one million polygons for a detailed model,
and the car's power train can consist of 30 million polygons.

These numbers are large, but they pale in comparison to the
size of nonconsumer items. A Boeing 777 airplane contains
approximately 132,500 unique parts and over 3,000,000
fasteners, yielding a 3D model containing more than
500,000,000 polygons. A study that examined the complexity of
naval platforms determined that a submarine is approximately
ten times more complex than an airplane, and an aircraft
carrier is approximately ten times more complex than a
submarine. 3D models containing hundreds of millions or
billions of polygons are real today.

As big as these numbers are, the problem does not stop there.
Designers, manufacturers, and users of these complex products
not only want to model and visualize the entire product, but
they also want to do it in the context of the manufacturing
process and in the context in which it is used. If the ship
and the dry dock can be realistically modeled and combined, it
will be far less expensive to find and correct problems before
they are built.

Current System Limitations 

If the task faced by technical users is to interact with very
large 3D models, how are the currently available systems
doing? In a word, badly. Clearly the graphics pipeline alone
is not going to solve the problem even with hardware
acceleration. Assuming that rendering performance for
reasonable interactivity must be at least 10 frames per
second, a pipeline capable of rendering 1,000,000 polygons per
second has no hope of interactively rendering any model larger
than 100,000 polygons per frame. Even the HP VISUALIZE fx, the
world's fastest desktop graphics system, which is capable of
rendering 4.6 million triangles per second, can barely provide
10 frames per second interactivity for a complete HP DeskJet
printer model.
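
The per-frame budget behind these figures is a single division. The short C sketch below works it through; the function names polygons_per_frame and achieved_frame_rate are chosen here for illustration, and the model assumes the whole model is redrawn every frame at the pipeline's peak rate, which is an optimistic simplification.

```c
#include <assert.h>

/* Largest model, in polygons, that fits in one frame at a target rate. */
long polygons_per_frame(long polygons_per_second, long target_fps)
{
    assert(target_fps > 0);
    return polygons_per_second / target_fps;
}

/* Frame rate obtained when the whole model is drawn every frame. */
double achieved_frame_rate(long polygons_per_second, long model_polygons)
{
    assert(model_polygons > 0);
    return (double)polygons_per_second / (double)model_polygons;
}
```

For example, polygons_per_frame(1000000, 10) gives the 100,000-polygon limit quoted above, and a 4.6 million triangle per second pipeline has a peak budget of 460,000 triangles per frame at 10 frames per second, which leaves little headroom over a model of more than 300,000 triangles once real-world rates fall below peak.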

This is a sobering reality faced by many mechanical designers
and other technical users today. Their systems work well for
dealing with individual components but come up short when
facing the complete problem.

Approaches to Solving the Problem 

There are several approaches to solve the problem of rendering
very complex 3D models with interactive performance. One
approach is to increase the performance of the graphics
hardware. Hewlett-Packard and other graphics hardware vendors
are investing a lot of effort in this approach. However,
increasing hardware performance alone is not sufficient
because the complexity of many customers' problems is
increasing faster than gains in hardware performance. A second
approach that must also be explored involves using software
algorithms to reduce the complexity of the 3D models that are
rendered.

Complex Data Sets

To understand the general data complexity problem, we
must examine it from the perspective of the application
developer. If a developer is creating a game, then it is
perfectly valid to search for ways to create the imagery
while minimizing the amount of data behind it. This approach
is served well by techniques such as extensive
use of texture maps on a relatively small amount of geometry.
However, for an application responsible for producing
or analyzing technical data, it is rarely effective to
improve the rendering performance by manually altering
and reducing the data set. If the data set is huge, the application
must be able to make the best of it during 3D
rendering. Unfortunately, the problem of exponential
growth in data complexity cannot be solved through
incremental improvements to the performance of the
current 3D graphics architectures; new approaches are
required.

Pixels per Polygon

Although the problem of interactively rendering large 3D
models on a typical engineering workstation is challenging,
it is not intractable. If the workstation's graphics pipeline
is capable of rendering a sustained 200,000 polygons per
second (a conservative estimate), then each frame must
be limited to 20,000 polygons to maintain 10 frames per
second. A typical workstation with a 1280 by 1024 monitor
provides 1,310,720 pixels. To cover this screen completely
with 20,000 polygons, each polygon must have an
average area of 66 pixels. A more realistic estimate is that
the rendered image covers some subset of the screen, say
75 percent, and that several polygons, for example four,
overlap on each pixel, which implies each polygon must
cover an area of approximately 200 pixels.

On a typical workstation monitor with a screen resolution
of approximately 100 pixels per inch, these polygons are a
bit more than 0.1 inch on a side. Polygons of this size will
create a high enough quality image for most engineering
tasks. This image quality is even more compelling when
you consider that it is the resolution produced during
interactive navigation. A much higher-quality image can
be rendered within a few seconds when the user stops
interacting with the model. Thus, today's 3D graphics
workstations have enough rendering power to produce
the fast, high-quality images required by the technical
user.
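
The arithmetic behind these estimates can be checked with a short sketch. The function and variable names are illustrative assumptions; only the numbers (200,000 polygons per second, 10 frames per second, a 1280 by 1024 screen, 75 percent coverage, an overlap of four, and 100 pixels per inch) come from the text.

```python
import math

def polygons_per_frame(polygons_per_second, frames_per_second):
    """Polygon budget for one frame at the target frame rate."""
    return polygons_per_second / frames_per_second

def pixels_per_polygon(width, height, polygons, coverage=1.0, overlap=1.0):
    """Average screen area, in pixels, that each polygon must cover."""
    return width * height * coverage * overlap / polygons

budget = polygons_per_frame(200_000, 10)               # 20,000 polygons
full_screen = pixels_per_polygon(1280, 1024, budget)   # about 66 pixels
realistic = pixels_per_polygon(1280, 1024, budget,
                               coverage=0.75, overlap=4)  # about 200 pixels

# Side length of a square polygon of that area at 100 pixels per inch.
side_inches = math.sqrt(realistic) / 100
```

Treating each polygon as a square is a simplification made here only to turn the area into a side length; it gives roughly 0.14 inch, in line with "a bit more than 0.1 inch".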

Software Algorithms

The challenge of interactive large model rendering is sorting
through the millions of polygons in the model and
choosing (or creating) the best subset of those polygons
that can be rendered in the time allowed for the frame.
Algorithms that perform this geometry reduction fall into
two broad categories: culling, which eliminates unnecessary
geometry; and simplification, which replaces some
set of geometry with a simpler version.

Figure 3

Figure 3 illustrates two types of culling: view frustum
culling (eliminating geometry that is outside of the user's
field of view) and occlusion culling (eliminating geometry
that is hidden behind some other geometry). The article
on page 9 describes the implementation of occlusion culling
in the VISUALIZE fx graphics accelerator.

Figures 4 and 5 show two types of simplification. Figure
4 shows a form of geometry simplification called tessellation,
which takes a mathematical specification of a smooth
surface and creates a polygonal representation at the specified
level of resolution.



The decimation simplification technique is shown in
Figure 5. This technique reduces the number of polygons
in a model by combining adjacent faces and edges.

The simplified geometry created by these algorithms is
used by the level of detail selection algorithms, which
choose the appropriate representation to render for each
frame based on criteria such as the distance to the object.
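
As a minimal sketch of distance-based level of detail selection, assume each object carries a few precomputed representations, each valid up to some viewing distance. The `Representation` type, its fields, and the thresholds are hypothetical and are not part of DirectModel.

```python
from dataclasses import dataclass

@dataclass
class Representation:
    name: str
    polygon_count: int
    max_distance: float  # use this representation up to this distance

def select_lod(representations, distance):
    """Pick the most detailed representation whose range covers the distance.

    Falls back to the coarsest representation when the object is farther
    away than every threshold.
    """
    ordered = sorted(representations, key=lambda r: r.max_distance)
    for rep in ordered:
        if rep.max_distance >= distance:
            return rep
    return ordered[-1]
```

A production selector would likely weigh other criteria as well, such as projected screen size or the remaining polygon budget for the frame; distance is used here because it is the criterion the text names.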

Most 3D graphics pipelines render a model by rendering
each primitive such as a polygon, line, or point individually.
If the model contains a million polygons, then the
polygon-rendering algorithm is executed a million times.
In contrast, these geometry reduction algorithms must
operate on the entire 3D model at once, or a significant
portion of it, to achieve adequate gains. View frustum
culling is a good example: the conventional 3D graphics
pipeline will perform this operation on each individual
polygon as it is rendered. However, to provide any significant
benefit to the large model rendering problem, the
culling algorithm must be applied globally to a large chunk
of the model so that a significant amount of geometry can
be eliminated with a single operation. Similarly, the geometry
simplification algorithms can provide greatest gains
when they are applied to a large portion of the model.
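
One common way to cull a large chunk of a model with a single operation is to test a bounding volume for a whole subtree against the view frustum planes. The sketch below assumes bounding spheres and unit-length plane normals; the `Node` type and the plane representation are illustrative assumptions, not the toolkit's actual data structures.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    center: tuple        # bounding-sphere center (x, y, z)
    radius: float        # bounding-sphere radius
    polygons: int = 0    # polygons stored directly in this node
    children: list = field(default_factory=list)

def outside(node, planes):
    """True if the sphere lies entirely outside at least one plane.

    Each plane is (a, b, c, d) with (a, b, c) of unit length; a point is
    inside the plane when a*x + b*y + c*z + d >= 0.
    """
    x, y, z = node.center
    return any(-(a * x + b * y + c * z + d) > node.radius
               for a, b, c, d in planes)

def visible_polygons(node, planes):
    """Count polygons that survive culling, rejecting whole subtrees."""
    if outside(node, planes):
        return 0  # one test eliminates every polygon below this node
    return node.polygons + sum(visible_polygons(child, planes)
                               for child in node.children)
```

The gain comes from the early return: a rejected node never visits its children, so the cost of culling depends on the number of nodes tested rather than on the number of polygons in the model.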

Desired Solution 

The performance gap (often several orders of magnitude)
between the needs of the technical user and the capabilities
of a typical system puts developers of technical applications
into an unfortunate bind. Developers are often
experts in some technical domain that is the focus of their
applications, perhaps stress analysis or piping layout.
However, the 3D data sets that the applications manage
are exceeding the graphics performance of the systems
they run on. Developers are faced with the choice of obtaining
the 3D graphics expertise to create a sophisticated
rendering architecture for their applications, or seeing
their applications lag far behind their customers' needs
for large 3D modeling capacity and interactivity.

Figure 4

Geometry tessellation.

Figure 5

Geometry decimation.

Figure 6

Extended graphics pipeline.

To develop applications with the performance demanded
by their customers, developers need access to graphics
systems that provide dramatic performance gains for their
tasks and data. As shown in Figure 6, the graphics pipeline
available to the applications must be extended to
include model-based optimizations, such as culling and
simplification, so that it can support interactive rendering
of very large 3D models. When the graphics system provides
this level of performance, application developers
are free to focus on improving the functionality of their
applications without concern about graphics performance.
The article on page 9 describes the primitive-based
operations of the pipeline shown in Figure 6.

DirectModel Capabilities 

DirectModel is a toolkit for creating technical 3D graphics
applications. The engineer or scientist who must create,
visualize, and analyze massive amounts of 3D data does
not interact directly with DirectModel. DirectModel provides
high-level 3D model management of large 3D geometry
models containing millions of polygons. It uses
advanced geometry simplification and culling algorithms
to support interactive rendering. Figure 1 shows that
DirectModel is implemented on top of traditional 3D
graphics APIs such as Starbase or OpenGL. It extends,
but does not replace, the current software and hardware
3D rendering pipeline.

Key aspects of the DirectModel toolkit include:

■ A focus on the needs of technical applications that deal
with large volumes of 3D geometry data

■ Capability for cross-platform support of a wide variety
of technical systems

■ Extensive support of MCAD applications (for example,
translators for common MCAD data types).

Technical Data 

As discussed above, the underlying data is often the most
important item to the user of a technical application. For
example, when designers select parts on the screen and
ask for dimensions, they want to know the precise engineering
dimension, not some inexact dimension that results
when the data is passed through the graphics system
for rendering. DirectModel provides the interfaces that
allow the application to specify and query data with this
level of technical precision.

Technical data often contains far more than graphical information.
In fact, the metadata such as who created the
model, what it is related to, and the results of analyzing it
is often much larger than the graphical data that is rendered.
Consequently, DirectModel provides the interfaces
that allow an application to create the links between the
graphical data and the vast amount of related metadata.

Components of large models are often created, owned,
and managed by people or organizations that are loosely
connected. For example, one design group might be
responsible for the fuselage of an airplane while a separate
group is responsible for the design of the engines.
DirectModel supports this multiteam collaboration
by allowing a 3D model to be assembled from several
smaller 3D models that have been independently defined
and optimized.

Multiple Representations of the Model

The 3D model is the central concept of DirectModel: the
application defines the model and DirectModel is responsible
for high-performance optimization and rendering of
it. The 3D model is defined hierarchically by the model
graph, which consists of a set of nodes linked together
into a directed, acyclic graph. However, a common problem
that occurs when creating a model graph is the conflict
between the needs of the application and those of the
graphics system. The application typically needs to organize
the model based on the logical relationships between
the components, whereas the graphics system
needs to organize the model based on the spatial relationships
so that it can be efficiently simplified, culled,
and rendered. Figure 7 shows two model graphs for a car,
one organized logically and one spatially.

Figure 7

Graphics toolkits that use a single model graph for both
the application's interaction with the model and for rendering
the model force the application developer to optimize
for one use while making the other use difficult. In
contrast, DirectModel maintains multiple organizations of
the model so that it can simultaneously be optimized for
several different uses. The application is free to organize
its model graph based on its functional requirements
without consideration of DirectModel's rendering needs.
DirectModel will create and maintain an additional spatial
organization that is optimized for rendering. These multiple
organizations do not significantly increase the memory or
disk usage of DirectModel because the actual geometry,
by far the largest component, is multiply referenced, not
duplicated.
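
The idea of two organizations sharing one copy of the geometry can be sketched as follows. The classes, part names, and polygon counts are hypothetical illustrations of the car example, not DirectModel's API.

```python
class Geometry:
    """Stands in for the bulky polygon data of one part."""
    def __init__(self, name, polygon_count):
        self.name = name
        self.polygon_count = polygon_count

class GraphNode:
    def __init__(self, label, geometry=None, children=()):
        self.label = label
        self.geometry = geometry      # a reference, never a copy
        self.children = list(children)

def geometries(node):
    """Yield every Geometry object reachable from a node."""
    if node.geometry is not None:
        yield node.geometry
    for child in node.children:
        yield from geometries(child)

wheel_fl = Geometry("front-left wheel", 8000)
wheel_rr = Geometry("rear-right wheel", 8000)
hood = Geometry("hood", 3000)
trunk = Geometry("trunk lid", 2500)

# Logical organization: grouped by function.
logical = GraphNode("car", children=[
    GraphNode("wheels", children=[GraphNode("fl", wheel_fl),
                                  GraphNode("rr", wheel_rr)]),
    GraphNode("body", children=[GraphNode("hood", hood),
                                GraphNode("trunk", trunk)]),
])

# Spatial organization: grouped by location, referencing the same objects.
spatial = GraphNode("car", children=[
    GraphNode("front", children=[GraphNode("fl", wheel_fl),
                                 GraphNode("hood", hood)]),
    GraphNode("rear", children=[GraphNode("rr", wheel_rr),
                                GraphNode("trunk", trunk)]),
])

shared = ({id(g) for g in geometries(logical)}
          == {id(g) for g in geometries(spatial)})
```

Only the small interior nodes are duplicated between the two graphs; the four `Geometry` objects exist once and are referenced from both.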

The Problem of Motion

Object motion, both predefined and interactive, is critical
to many technical applications. In mechanical design, for
example, users want to see suspension systems moving,
engines rocking, and pistons and valves in motion. To use
a virtual prototype for manufacturing planning, motion is
mandatory. Assembly sequences can be verified only by
observing the motion of each component as it moves into
place along its prescribed path. Users also want to grab
an object or subassembly and move it through space,
while bumping and jostling the object as it interferes with
other objects in its path. In short, motion is an essential
component for creating the level of realism necessary for
full use of digital prototypes.

DirectModel supports this demand for adding motion to
3D models in several ways. Because DirectModel does not
force an application to create a model graph that is optimized
for fast rendering, it can instead create one that is
optimized for managing motion. Parts that are physically
connected in real life can be connected in the model graph,
allowing movement to cascade easily through all of the
affected parts. In addition, the data structures and algorithms
used by DirectModel to optimize the model graph
for rendering are designed for easy incremental update
when some portion of the application's model graph
changes.
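
A small sketch shows how connecting parts in the graph lets movement cascade. It assumes translation-only offsets relative to the parent for brevity (a real system would use full transformation matrices), and the `Part` class and part names are hypothetical.

```python
class Part:
    def __init__(self, name, offset=(0.0, 0.0, 0.0)):
        self.name = name
        self.offset = offset   # translation relative to the parent part
        self.children = []

    def attach(self, child):
        """Connect a child part and return it so calls can be chained."""
        self.children.append(child)
        return child

def world_positions(part, parent_position=(0.0, 0.0, 0.0), out=None):
    """Accumulate offsets down the graph to get each part's position."""
    out = {} if out is None else out
    position = tuple(p + o for p, o in zip(parent_position, part.offset))
    out[part.name] = position
    for child in part.children:
        world_positions(child, position, out)
    return out

crank = Part("crankshaft")
rod = crank.attach(Part("connecting rod", (0.0, 1.0, 0.0)))
piston = rod.attach(Part("piston", (0.0, 2.0, 0.0)))

before = world_positions(crank)
crank.offset = (0.0, 0.5, 0.0)   # move only the crankshaft
after = world_positions(crank)   # the rod and piston follow
```

Changing one offset at the top of the chain moves every connected part below it, which is the cascading behavior the text describes.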

Models as Databases 

3D models containing millions of polygons with a rich set
of rendering attributes and metadata can easily require
several gigabytes of data. Models of this size are frequently
too big to be completely held in main memory,
which makes it particularly challenging to support
smooth interactivity.

DirectModel solves this problem by treating the model as a
database that is held on disk and incrementally brought in
and out of main memory as necessary. Elements of the
model, including individual level-of-detail representations,
must come from disk as they are needed and be removed
from main memory when they are not needed. In this way
memory can be reserved for the geometric representations
currently of interest. DirectModel's large model
capability has as much to do with rapid and intelligent
database interaction as with rendering optimization.
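
The paging behavior can be sketched with a small cache that loads representations on demand and evicts the least recently used ones when a memory budget is exceeded. The least-recently-used policy, the class name, and the loader callback are assumptions made for illustration; the text does not say which replacement policy DirectModel uses.

```python
from collections import OrderedDict

class RepresentationCache:
    def __init__(self, capacity_bytes, load_from_disk):
        self.capacity = capacity_bytes
        self.load = load_from_disk      # callable: key -> bytes
        self.resident = OrderedDict()   # least recently used entry first
        self.used = 0

    def fetch(self, key):
        """Return the data for key, reading it from disk only on a miss."""
        if key in self.resident:
            self.resident.move_to_end(key)   # mark as recently used
            return self.resident[key]
        data = self.load(key)
        self.resident[key] = data
        self.used += len(data)
        # Evict old entries until the budget is met, keeping the new one.
        while self.used > self.capacity and len(self.resident) > 1:
            _, evicted = self.resident.popitem(last=False)
            self.used -= len(evicted)
        return data
```

A renderer would call `fetch` for each representation chosen for the current frame, so memory naturally fills with the geometry currently of interest.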

Interactive versus Batch-Mode Data Preparation

Applications that deal with large 3D models have a wide
range of capabilities. One application may be simply an
interactive viewer of large models that are assembled from
existing data. Another application may be a 3D editor (for
example, a solid modeler) that supports designing mechanical
parts within the context of their full assembly.
Consequently, an application may acquire and optimize a
large amount of 3D geometry all at once, or the parts of
the model may be created little by little.

DirectModel supports both of these scenarios by allowing
model creation and optimization to occur either interactively
or in batch mode. If an application has a great deal
of raw geometry that must be rendered, it will typically
choose to provide a batch-mode preprocessor that builds
the model graph, invokes the sorting and simplification
algorithms, and then saves the results. An interactive application
can then load the optimized data and immediately
allow the user to navigate through the data. However, if
the application is creating or modifying the elements of
the model at a slow rate, then it is reasonable to sort and
optimize the data in real time. Hybrid scenarios are also
possible where an interactive application performs incremental
optimization of the model with any spare CPU
cycles that are available.

The important thing to note in these scenarios is that
DirectModel does not make a strong distinction between
batch and interactive operations. All operations can be
considered interactive and the application developer is
free to employ them in a batch manner when appropriate.

Extensibility 

Large 3D models used by technical applications have
different characteristics. Some models are highly regular
with geometry laid out on a fixed grid (for example,
rectangular buildings with rectangular rooms) whereas
others are highly irregular (for example, an automobile
engine with curved parts located at many different
orientations). Some models have a high degree of occlusion
where entire parts or assemblies are hidden from
many viewing perspectives. Other models have more
holes through them allowing glimpses of otherwise hidden
parts. Some models are spatially dense with many
components packed into a tight space, whereas others
are sparse with sizable gaps between the parts.

These vast differences impact the choice of effective optimization
and rendering algorithms. For example, highly
regular models such as buildings are amenable to preprocessing
to determine regions of visibility (for example,
rooms A through E are not visible from any point in room
Z). However, this type of preprocessing is not very effective
when applied to irregular models such as an engine.
In addition, large model visualization is a vibrant field of
research with innovative new algorithms appearing regularly.
The algorithms that seem optimal today may appear
very limiting tomorrow.

DirectModel's flexible architecture allows application
developers to choose the right combination of techniques,
including creating new algorithms to extend the system's
capabilities. All of the DirectModel functions, such as its
culling algorithms, representation generators, tessellators,
and picking operators, are extensible in this way.
Extensions fit seamlessly into the algorithms they extend,
indistinguishable from the default capabilities inherent
to the toolkit.

In addition, DirectModel supports mixed-mode rendering
in which an application uses DirectModel for some of its
rendering needs and calls the underlying core graphics
API directly for other rendering operations. Although
DirectModel can fulfill the complete graphics needs of many
applications, it does not require that it be used exclusively.

Multiplatform Support

A variety of systems are commonly used for today's technical
3D graphics applications, ranging from high-end
personal computers through various UNIX-based workstations
and supercomputers. In addition, several 3D
graphics APIs and architectures are either established or
emerging as appropriate foundations for technical applications.
Most developers of technical applications support a
variety of existing systems and must be able to migrate
their applications onto new hardware architectures as the
market evolves.

DirectModel has been carefully designed and implemented
for optimum rendering performance on multiple platforms
and operating systems. It presumes no particular graphics
API and is designed to select at run time the graphics API
best suited to the platform or specified by the application.
In addition, its core rendering algorithms dynamically
adapt themselves to the performance requirements of the
underlying graphics pipeline.



Conclusion 



The increasing use of 3D graphics as a powerful tool for
solving technical problems has led to an explosion in the
complexity of problems being addressed, resulting in 3D
models containing millions or even billions of polygons.



Unfortunately, many of the applications and 3D graphics
systems in use today are built on architectures designed
to handle only a few thousand polygons efficiently.
These architectures are incapable of providing interactivity
with today's large technical data sets.

This problem has created a strong demand for new graphics
architectures and products that are designed for interactive
rendering of large models on affordable systems.
Hewlett-Packard is meeting this demand with DirectModel,
a cross-platform toolkit that enables interaction
with large, complex 3D models.

An Overview of the VISUALIZE fx Graphics
Accelerator Hardware

Three graphics accelerator products with different levels of performance are
based on varying combinations of five custom integrated circuits. In addition,
these products are the first ones from Hewlett-Packard to provide native
acceleration for the OpenGL® API.



The VISUALIZE fx family of graphics subsystems consists of three
products, fx2, fx4, and fx6, and an optional hardware texture mapping module.
These products are built around a common architecture using the same
custom integrated circuits. The primary difference between these controllers
is the number of custom chips used in each product (see Table I).



Table I

Number of custom chips in the different VISUALIZE fx products

Product   Texture Chip   Geometry Chip   Raster Chip
fx2       -              1               2
fx4       1              2               2
fx6       2              3               4



A chip-level block diagram of the VISUALIZE fx6 product is shown in Figure 1.
This is the most complex configuration and also the one with the highest
performance in the product line. The VISUALIZE fx2 and the VISUALIZE fx4
products use subsets of the chips used in the fx6. The fx4 and fx6 subsystems
have support for the optional hardware-accelerated texture map module,
which contains a local texture cache for storage of texture map images. If the
texture accelerator is not present, the bus between the interface chip and the
first raster chip is directly connected.

Figure 1

A chip-level diagram of the VISUALIZE fx6 product.

Interface Chip

The interface chip provides a PCI 2.1 (also referred to as
PCI 2X) compliant interface.* It operates at up to 66 MHz
in 64-bit mode. Special efforts have been made in the
design of the buffering and the interface to the PCI. As a
result, the driver is able to sustain writes of 3D geometry
commands to the PCI at almost the theoretical maximum
rates that could be computed for the PCI. The article on
page 51 discusses PCI capability.

* PCI = Peripheral Component Interconnect.

Occlusion Culling



The HP fast-break program (page 8) enabled us to understand 
customer requirements by analyzing what is important in 
OpenGL graphics today. As a result, we developed a technol- 
ogy called occlusion culling as an extension to OpenGL and 
implemented it in the VISUALIZE fx graphics hardware. 

We found that the data sets many graphics workstation customers
are trying to visualize are very complex. These data
sets have large numbers of small, complex components that
are not always visible in the final images. For instance, when
rendering an airplane, all of the MCAD parts are present in the
data set represented by potentially millions of polygons that
must be processed. However, when this airplane is viewed
from the outside only the outer surfaces are visible, not the fan
blades of the engine or the seats or bulkheads in the interior.

In a traditional 3D z-buffered graphics system, all polygons in
a scene must be processed by the graphics pipeline because it
is not known a priori which polygons will be visible and which
ones will be occluded (not visible). The notion of occlusion
culling, or removal of occluded objects, has been talked about
in the research community for several years. However, implementations
tend to be in software where the performance is
not at a satisfactory level.

In the VISUALIZE fx series of graphics devices, HP developed
a very efficient algorithm that tests objects for visibility.
An application program can very quickly use the occlusion
culling visibility test to determine if a simple bounding box
representation of a more complex part is visible. Since a
bounding box, or more generally a bounding volume, completely
encloses the more complex part, it is possible to know
a priori that if the bounding volume is not visible then the
complex part it encloses is not visible. Thus, the part that is
not visible does not need to be processed through the graphics
pipeline. The real benefit of occlusion culling comes when a
very complex part consisting of many vertices can be rejected,
avoiding the expenditure of valuable time to process it.
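
The bounding-volume test can be illustrated with a software stand-in. This is only a sketch of the idea, not the hardware algorithm, which the sidebar does not detail: it assumes the bounding box has already been projected to a screen rectangle with a single nearest depth, and that smaller depth values are closer to the viewer. All names are hypothetical.

```python
def bounding_box_visible(depth_buffer, rect, nearest_depth):
    """True if any pixel under the box could show something at that depth.

    rect is (x0, y0, x1, y1), inclusive, in pixels. depth_buffer[y][x]
    holds the depth of the closest surface drawn so far at that pixel.
    """
    x0, y0, x1, y1 = rect
    return any(depth_buffer[y][x] > nearest_depth
               for y in range(y0, y1 + 1)
               for x in range(x0, x1 + 1))

def draw_parts(depth_buffer, parts):
    """Return the names of parts whose bounding boxes pass the test.

    Each part is (name, rect, nearest_depth, vertex_count); a real
    renderer would send vertex_count vertices only for the parts kept.
    """
    drawn = []
    for name, rect, nearest_depth, vertex_count in parts:
        if bounding_box_visible(depth_buffer, rect, nearest_depth):
            drawn.append(name)
    return drawn
```

The test is conservative: a box that passes may still enclose a part that turns out to be hidden, but a box that fails guarantees the part can be skipped, so a part with many vertices is rejected for the cost of testing one box.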

For very complex data sets, such as the airplane mentioned
above or an automobile, a tremendous performance increase
can be realized by using the HP occlusion culling technology.
To date, several ISVs have begun using occlusion culling in
their applications and are seeing a 26 to 100 percent increase
in graphics performance. This magnitude of performance benefit
typically costs a customer several thousand dollars for the
extra computational horsepower. HP includes this technology
as standard in all VISUALIZE fx series graphics accelerators,
giving even better price and performance results to our
customers.

The future of 3D graphics will continue toward visualizing ever
more complex objects and environments. Occlusion culling
together with HP's DirectModel technology (page 19) are
well positioned to be industry leaders in providing the technology
for 3D modeling applications.



The primary responsibility of the interface chip is to separate
the streams of data that arrive from the host SPU into
three paths and arbitrate access among those paths.

3D Path. Typically data from the host CPU looks very
much like the OpenGL API functions themselves. Data
following this first path is routed to the geometry chips.
The geometry chips process the data and return the results
to the interface chip. These results are then sent on
to the texture chips or directly to the raster chips if the
texture mapping subsystem is not installed. In either case
the data is transmitted to and through all the texture and
raster chips in the system.

Unbuffered Path. This path passes data directly through
the interface chip to the texture and raster chips. This
provides a bypass method that allows traffic to get around
other pending operations. An example would be a texture
cache download that is required to complete a primitive
that is currently being rasterized, a situation that would
lead to deadlock without the unbuffered path.

2D Path. This path runs directly through the interface chip
to the texture and raster chips. The 2D path differs from
the unbuffered path in the way its priority is handled. The
interface chip manages priority among the three paths as
they all converge on the same set of wires between the
interface chip and the first texture chip. The unbuffered
path goes directly through the interface chip to those
wires and has priority over the other two paths. Data
targeting the 2D path is held off until all preceding 3D
work in the geometry chip has been flushed through to
the first texture chip.

There is also special circuitry in the interface chip that is
used to accelerate many operations commonly done by
X11 or other 2D APIs.

Buses

The three primary buses in the system are each run at
200 MHz, allowing sustainable transfer rates of more
than 800 Mbytes per second. To control the loading on
the interconnections for these buses, they are built as
point-to-point connections from one chip to the next.

Each chip receives the signals and then retransmits them
to the next chip in the sequence. This requires more pins
on each part, but limits the number of loads on each wire
to a single receiver as well as limiting the wiring length
that signals must traverse. This allows for reliable communications
despite the high frequency of the buses.

The first of these three buses distributes work to the
geometry chips. This bus starts at the interface chip
and runs through all the geometry chips in the system.
Each geometry chip monitors the data stream as it flows
through the bus and picks off work to operate upon based
on an algorithm that selects the least busy geometry chip.

The second of these buses starts at the last geometry chip
and passes through the others back to the interface chip.
The results of the work done by the geometry chips are
placed on this bus in the same sequence as the work was
moved along the first bus. This strict ordering control prevents
certain artifacts from showing up in the final image.
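
The combination of least-busy distribution and strictly ordered results can be sketched in software. This models the behavior described above, not the chips' actual selection logic; the cost values, sequence numbers, and function names are illustrative assumptions.

```python
def dispatch(work_items, chip_count):
    """Assign each (sequence_number, cost) item to the least busy chip.

    Returns the list of (sequence_number, chip_index) assignments and the
    accumulated load per chip. Ties go to the lowest-numbered chip.
    """
    busy = [0] * chip_count
    assignment = []
    for seq, cost in work_items:
        chip = min(range(chip_count), key=lambda i: busy[i])
        busy[chip] += cost
        assignment.append((seq, chip))
    return assignment, busy

def collect_in_order(finished):
    """Yield results in sequence order, whatever order they finish in.

    finished is an iterable of (sequence_number, result) pairs in
    completion order; early arrivals are buffered until their turn.
    """
    pending = {}
    next_seq = 0
    for seq, result in finished:
        pending[seq] = result
        while next_seq in pending:
            yield pending.pop(next_seq)
            next_seq += 1
```

Because a cheap primitive on one chip can finish before an expensive earlier primitive on another, the ordered collection step is what keeps the output stream in the sequence the application issued.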

The third bus ties the interface chip to the texture and
frame buffer subsystems. It is wired in a loop that goes
back to the interface chip from the last chip in the chain.
3D operations typically flow from the interface chip to
the chips along this bus, and when they eventually get
back to the end of the loop, they are thrown away.

For 2D operations, such as moving blocks of pixels
around the frame buffer, the operation of the third bus is
somewhat different. The movement of pixel data operates
as a sequence of reads followed by a sequence of writes.
The reads cause data to be dumped from the frame buffer
locations onto the bus and the results travel back to the
interface chip. This data is then associated with new
addresses and sent as writes back down the bus, ending
up back at the frame buffer but in different locations.

Besides the three primary buses mentioned above,
there are three secondary buses in the system. The first
bus connects the interface chip to the video chip. This
provides video control, download of color maps, and
cursor control. The second bus is a connection from each
raster chip to the video chip. This path is used to provide
video refresh data to display frame buffer contents. The
final secondary bus is a connection from each texture
chip to two of the raster chips. This path allows the flow
of filtered texture data into the raster chips for combination
with nontexture fragment data.

Geometry Chip

The geometry and lighting chips are responsible for taking
in geometric primitives (points, lines, triangles, and quadrilaterals)
and executing all the operations associated
with the transform stage of the graphics pipeline (see the
article on page 9 for more about the graphics pipeline).
These operations include:

■ Transformation of the coordinates from model space to
eye space

■ Computing a vertex color based on the lighting state,
which consists of up to eight directional or positional
light sources

■ Texture map calculations that include:

   □ Environment map calculations for texture mapping
   □ Texture coordinate transformation
   □ Linear texture coordinate generation
   □ Texture projection

■ View volume clipping and clipping against six arbitrary
application-specified planes to determine whether a
primitive is completely visible, rejected because it is
completely outside the view area, or needs to be
reduced into its visible components

■ Perspective projection transformation to cause
primitives to look smaller the farther away from
the eye they are

■ Setup calculations for rasterization in the raster chip.
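
The three-way clipping decision in the list above (completely visible, rejected, or needing reduction) can be sketched as a classification of a primitive's vertices against a set of planes. This is a generic illustration under stated assumptions, not the chip's implementation: planes are (a, b, c, d) tuples, and a point is inside a plane when a*x + b*y + c*z + d is not negative.

```python
def signed_distance(plane, point):
    """Signed distance from a plane (a, b, c, d) to a point (x, y, z)."""
    a, b, c, d = plane
    x, y, z = point
    return a * x + b * y + c * z + d

def classify(vertices, planes):
    """Return 'accept', 'reject', or 'clip' for one primitive."""
    needs_clip = False
    for plane in planes:
        inside = [signed_distance(plane, v) >= 0 for v in vertices]
        if not any(inside):
            return "reject"      # every vertex is outside this plane
        if not all(inside):
            needs_clip = True    # the primitive straddles this plane
    return "clip" if needs_clip else "accept"
```

The classification is conservative in one direction: a primitive reported as `clip` may still turn out to be entirely outside the volume once it is actually clipped, but `accept` and `reject` are exact for convex primitives.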

There were some interesting problems to solve in the
design of the distribution and coalescing of work up and
down the geometry chip daisy chain. For example, load
balancing, maintaining strict order in the output stream,
and ensuring that operations, such as binding of colors
and normals to vertices, perform as required by OpenGL.

Fast Virtual Texturing



Texture mapping, which is wrapping a picture over a three-dimensional
object, has been used over the years as a key
feature to enhance photorealism, reduce data set sizes, perform
visual analysis, and aid in simulations (see Figure 1).
Since texturing calculations are computationally expensive
and memory access for large textures can be prohibitively
slow, various workstation graphics vendors have provided
hardware-accelerated texture mapping as a key differentiator
for their product.

A primary drawback of these attempts at hardware acceleration
is that dedicated local hardware texture memory is limited
in size and is expensive. To take advantage of the performance
boost, graphics applications were constrained to textures
that fit in the local hardware texture memory. In other
words, the application was responsible for managing this
hardware resource.

Figure 1

A 3D textured skull. The VISUALIZE fx4 and fx6 subsystems
support a texture map acceleration option. Pictured here
is the use of 3D texture mapping OpenGL extensions with
this option. This feature allows visualization of 3D data
sets such as MRI images.

Noticing this obvious artificial application limitation in texturing
functionality, performance, and portability, Hewlett-Packard
introduced, in the VISUALIZE-48, a new concept in hardware
texture mapping called virtual texture mapping. Virtual texture
mapping uses the dedicated local hardware texture memory
as a true texture cache, swapping in and out of the cache the
portions of textures that are needed for rendering a 3D image.
Thus, for texturing applications, these limitations were
eliminated. The application could define and use a texture map of
any size (up to a theoretical limit of 32K texels x 32K texels*)
that would be hardware accelerated, eliminating the need for
the application to be responsible for managing local texture
memory.

Using the local hardware texture memory as a cache also 
means that this memory uses only the portions of the texture 
maps needed to render the image. This efficiency translates 
to more and larger texture maps being hardware accelerated 
at the same time. Applications that previously could not run 
because of texture size limits can now run because of the
unlimited virtual texture size. Also, with only the used por- 
tions of the texture map being downloaded to the cache, far 
less graphics bus traffic occurs. 
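
The caching idea can be illustrated with a small sketch. The Python model below is not HP's implementation: the tile size, the least-recently-used replacement policy, and all names (TextureTileCache, fetch, load_tile) are assumptions made for illustration. It shows only the principle that local texture memory holds the tiles actually touched by rendering, and that a tile crosses the graphics bus only on a cache miss.

```python
from collections import OrderedDict


class TextureTileCache:
    """Toy model of local texture memory used as a cache of texture tiles."""

    def __init__(self, capacity_tiles, tile_size=64):
        self.capacity_tiles = capacity_tiles  # tiles that fit in local memory (assumed)
        self.tile_size = tile_size            # texels per tile edge (assumed)
        self.resident = OrderedDict()         # (texture_id, tile_x, tile_y) -> tile data
        self.downloads = 0                    # tiles sent over the graphics bus

    def fetch(self, texture_id, s, t, load_tile):
        """Return the tile holding texel (s, t), downloading it on a miss."""
        key = (texture_id, s // self.tile_size, t // self.tile_size)
        if key in self.resident:
            self.resident.move_to_end(key)     # mark as most recently used
            return self.resident[key]
        if len(self.resident) >= self.capacity_tiles:
            self.resident.popitem(last=False)  # evict the least recently used tile
        self.downloads += 1
        self.resident[key] = load_tile(*key)
        return self.resident[key]


def loader(texture_id, tile_x, tile_y):
    # Stands in for reading one tile of a large texture from host memory.
    return f"tile({texture_id},{tile_x},{tile_y})"


cache = TextureTileCache(capacity_tiles=2)
cache.fetch(7, 10, 10, loader)    # miss: tile (0, 0) is downloaded
cache.fetch(7, 20, 30, loader)    # hit: same tile, no bus traffic
cache.fetch(7, 100, 10, loader)   # miss: tile (1, 0) is downloaded
downloads_before_eviction = cache.downloads
cache.fetch(7, 200, 200, loader)  # miss: cache is full, tile (0, 0) is evicted
```

In this model the texture itself can be arbitrarily large, because only the tiles that rendering touches are ever resident.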

The system design of virtual texture mapping involved changes
in the HP-UX operating system to support graphics interrupts,
onboard firmware support for these interrupts, the introduction
of an asynchronous texture interrupt managing daemon process,
and the associated texturing hardware described in this

*A texel is one element of a texture.



The output of the geometry chip's daisy chain is passed
back through the interface chip. Generally, for triangle-based
primitives, the output takes the form of plane equations.
As these floating-point plane equations are returned
from the geometry chip to the interface chip and passed
on to the texture chips, certain addressed locations in the
interface chip will result in the floating-point values being
converted to fixed-point values as they pass through.
These fixed-point values are in a form the raster chips
need to rasterize the primitive.
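
As a rough illustration of what such a conversion involves, the sketch below turns floating-point plane equation coefficients into signed fixed-point integers. The article does not give the actual number format, so the 16 fractional bits, the 32-bit width, the saturating behavior, and the function names are all assumptions.

```python
def to_fixed(value, frac_bits=16, total_bits=32):
    """Convert a float to a signed fixed-point integer, saturating on overflow."""
    raw = int(round(value * (1 << frac_bits)))
    lowest = -(1 << (total_bits - 1))
    highest = (1 << (total_bits - 1)) - 1
    return max(lowest, min(highest, raw))


def from_fixed(raw, frac_bits=16):
    """Recover the real value represented by a fixed-point integer."""
    return raw / (1 << frac_bits)


# Hypothetical plane equation coefficients (A, B, C) for one interpolated value.
plane = (1.5, -0.25, 2.75)
fixed_plane = tuple(to_fixed(c) for c in plane)
print(fixed_plane)
```

Integer values of this kind can be stepped across a primitive using additions only, which is one common reason rasterizers prefer fixed-point input.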

The daisy-chain design allows up to eight of the geometry
chips to be used, although only three are applied in the
case of the VISUALIZE fx6 product at this time.

article. Having a centralized daemon process manage the
cache allows for cache efficiency, parallel handling of texture
downloads while 3D graphics rendering is occurring, and
sharing textures among graphics contexts.

The VISUALIZE fx4 and VISUALIZE fx6 texture mapping
options incorporate the second generation advances in virtual
texture mapping. Full OpenGL 1.1 texture map hardware support
has brought about dramatic improvements in texture
map download performance and switching between texture
maps and new extended features such as 3D texture mapping,
shadows (Figure 2), and proper specular lighting on textures
(Figure 3). These features have made these products very
appealing systems for texturing applications on workstation
graphics.

The texture mapping performance on these systems is very
competitive. The VISUALIZE fx6 texture fill rate is about twice
that of the VISUALIZE fx4 texture option. However, fill rates
alone do not describe how these systems perform in a true
application environment. Aggressive texture mapping application
performance comparisons show two to three times performance
superiority over similarly priced graphics workstation
products.



Figure 3 

A specular lit texture image. Correct specular lighting of
textured images can be achieved with VISUALIZE fx4 and
fx6 texture mapping options.




Texture Chip 

The texture chip is responsible for accelerating texture
mapping operations. Towards this end, it performs three
basic functions:

■ Maintains a cache of texture map data, requesting cache
updates for texture values required by current rendering
operations as needed (see "Fast Virtual Texturing" on
page 32)

■ Generates perspective corrected texture coordinates
from plane equations representing triangles, points, or
lines

■ Fetches and filters the texture data as specified by the
application based on whether the texture needs to be
magnified or minimized to fit the geometry it is being
mapped to and passes the result on to the raster chips.

Raster Chip

The raster chip rasterizes the geometry into the frame
buffer. This means it determines which pixels are to be
potentially modified and, if so, whether they should be
modified based on various current state values (including
the contents of the z buffer). The raster chip also controls
access to the various buffers that make up the frame
buffer. This includes the image buffer for storing the image
displayed on the screen (potentially two buffers if double
buffering is in effect), an overlay buffer that contains
images that overlay the image buffer, the depth or z buffer
for hidden surface removal, the stencil buffer,† and an
alpha buffer†† on the VISUALIZE fx6. To accomplish its
work the raster chip performs four basic functions:

■ Rasterize primitives described as points, lines, or
triangles

■ Apply fragment operations as defined by OpenGL (such
as blending and raster operations)

■ Control of and access to buffer memory, including all
the buffers described earlier

■ Refresh the data stream for the video chip, including
handling windows and overlays.

† A stencil buffer is per pixel data that can be updated when pixel data is written and used
to restrict the modification of the pixel.

†† An alpha buffer contains per pixel data that describes coverage information about the
pixel and can be used when blending new pixel values with the current pixel value.

Video Chip

The video chip provides video functions for controlling
the data flow from the frame buffer to the display and
mapping data from values to color. The features of the
video chip include:

■ Data mapping to colors:

□ Two independent 4096-by-24-bit lookup tables

□ Four independent 256-by-3-by-8-bit lookup tables
for image planes

□ A bypass path for 24-bit true color data

□ Two independent 256-by-8-bit lookup tables for
overlay planes

■ Digital-to-analog conversion

■ Video timing

■ Video output.
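
To make the lookup table sizes concrete, here is a minimal sketch of how a 256-by-3-by-8-bit table maps an 8-bit frame buffer value to a color. The table contents (a gray ramp with one reprogrammed entry) and the function names are illustrative assumptions; the article does not describe how the tables are loaded.

```python
def make_gray_ramp():
    """Build 256 entries of 3 channels of 8 bits: entry i maps to (i, i, i)."""
    return [(i, i, i) for i in range(256)]


def map_pixels(lut, pixel_values):
    """Translate 8-bit frame buffer values into (red, green, blue) triples."""
    return [lut[v & 0xFF] for v in pixel_values]


lut = make_gray_ramp()
lut[1] = (255, 0, 0)  # reprogram one entry: pixel value 1 now displays as red
colors = map_pixels(lut, [0, 1, 128, 255])
print(colors)
```

Changing a table entry recolors every pixel holding that value without rewriting the frame buffer, whereas 24-bit true color data takes the bypass path and needs no table.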



Conclusion

The VISUALIZE fx family of products currently has a substantial
lead not only in price/performance measurements
but also in performance independent of cost.

For information regarding how these systems compare
against the competition, visit the SPEC (an industry standard
body of benchmarks) web page at:

http://www.spec.org/gpc

HP Kayak: A PC Workstation with Advanced
Graphics Performance



Ross A. Cunniff 



World-leading 3D graphics performance, normally only found in a UNIX®
workstation, is provided in a PC workstation platform running the Windows
NT® operating system. This system was put together with a time to market of
less than one year from project initiation to shipment.



Ross A. Cunniff
A senior software engineer at the HP Performance Desktop Computing
Operation, Ross Cunniff has been with HP since 1985. He was the lead
software engineer for the 3D device driver used in the HP Kayak
workstation. He continues to be the lead 3D device driver engineer for
high-end graphics products. He received a BS degree in mathematics and
a BS degree in computer science in 1985 from the University of New
Mexico. His professional interests include computer graphics.

Computer graphics workstations are powerful desktop computers used
by a variety of technical professionals to perform their day-to-day work.
Traditionally, such computers have run with a version of the UNIX operating
system. In the past year, however, workstations featuring Intel processors such
as the Pentium™ Pro and Pentium II and running the Microsoft® Windows NT
operating system have begun to gain ground in both capability and market
share. Hewlett-Packard has historically been a leader in the UNIX workstation
business. In February, 1997, Hewlett-Packard began a project to put its
high-performance workstation graphics into a PC workstation platform.

Technical Challenges

Fitting HP workstation graphics into a Windows NT platform was not an easy
task. The task was made more exciting with the addition of schedule pressure.
The schedule gave us only four months to reach functional completion and
only two months after that to finish the quality assurance process. This schedule
was made even more challenging because the hardware was not yet complete.
It was difficult at times to distinguish software defects from hardware defects.
This article describes how we overcame some of the challenges we encountered
while implementing this project.

The Hardware 

The hardware for the HP Kayak workstation (Figure 1) is based on the
VISUALIZE fx4 graphics subsystem for real-time 3D modeling (see the article
on page 28). However, a couple of changes were necessary. First, to achieve
the performance available in the graphics hardware, the
bus interface had to be changed from the standard Peripheral
Component Interconnect (PCI) to the accelerated
graphics port (AGP),† since no commodity PC chipset
supported PCI 2X. With normal industry-standard PCI, we
would have been limited to 132 Mbytes/s for I/O, which
would have hurt our performance on several important
benchmarks. With the accelerated graphics port, the available
I/O bandwidth increased to 262 Mbytes/s.

Figure 1

An HP Kayak XW workstation.

The second change necessary to the hardware was the
addition of industry-standard VGA graphics. During the
boot process of Windows NT, and at occasional intervals
after that, the computer will access VGA graphics registers
directly. To achieve this, a VGA daughtercard was created
that displays its graphics through the video feature connector
created for the UNIX video solution. The main graphics
board was modified slightly, making it possible to dynamically
switch between VGA graphics and VISUALIZE fx4
graphics. Figure 2 shows a hardware block diagram for
an HP Kayak workstation.

† AGP is a bus that transfers data to and from a graphics accelerator.

Windows NT Driver Architecture 

The fact that the hardware for the HP Kayak workstation
is similar to the VISUALIZE fx4 hardware, which runs the
UNIX operating system, made the software effort much
easier. However, many significant hurdles had to be overcome
to get the software running under Windows NT.

The first challenge was the Windows NT device driver
architecture (Figure 3). On HP-UX, graphics device
drivers have a large amount of kernel support, allowing
them to access the graphics hardware directly from
user-level code without having to execute any special locking
routines. This direct hardware access (DHA) method is
not present on Windows NT. Instead, all accesses to the
hardware must be performed from the kernel (ring 0 in
Figure 3).



Figure 2

A hardware block diagram for an HP Kayak workstation.

Fortunately, the VISUALIZE fx4 architecture specifies a
buffered form of communication in which graphical commands
are placed into command data packets in a large
buffer in the hardware. It was a simple task to modify the
HP-UX drivers to access a software allocated command
data packet buffer instead. When one of these software
buffers gets full, it is passed to the ring 0 driver that
forwards the buffer to the hardware.
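
The buffering scheme can be sketched in a few lines. The Python model below is only an illustration of the idea: the class and callback names, the packet representation, and the buffer capacity are assumptions, and the callback merely stands in for the transition into the ring 0 driver.

```python
class CommandBuffer:
    """User-level buffer of command data packets, handed off when full."""

    def __init__(self, capacity, submit_to_kernel):
        self.capacity = capacity                  # packets per buffer (assumed)
        self.submit_to_kernel = submit_to_kernel  # stands in for the ring 0 driver
        self.packets = []

    def put(self, packet):
        """Queue one packet; hand the buffer to the kernel once it fills up."""
        self.packets.append(packet)
        if len(self.packets) >= self.capacity:
            self.flush()

    def flush(self):
        """Pass any pending packets to the kernel driver in one call."""
        if self.packets:
            self.submit_to_kernel(list(self.packets))
            self.packets.clear()


submitted = []  # each element is one buffer forwarded to the hardware
buf = CommandBuffer(capacity=3, submit_to_kernel=submitted.append)
for i in range(7):
    buf.put(("draw", i))
buf.flush()     # push the final, partially filled buffer
```

Seven packets reach the kernel in three calls rather than seven, which is the point of batching when every hardware access must go through ring 0.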

The lighter-shaded modules in Figure 3 represent the
libraries that were delivered by HP to support the
VISUALIZE fx4 hardware. The libraries in ring 3 (Hpicd.dll and
Hpvisxdx.dll) were fairly straightforward ports of the
corresponding UNIX libraries libGL.sl and libddvisxgl.sl.
The libraries in ring 0 (Hpvisxmp.sys, Hpvisxnt.dll, and
Hpvisxkx.dll) had to be created from scratch to support the
Windows NT driver model. These modules make up about
30 percent of the size of the ring 3 modules.

Figure 3

The Windows NT device driver architecture.

Integration with 2D Windows NT Graphics

The second challenge was to integrate the 3D OpenGL
graphics support with the standard Windows NT graphical
device interface. Microsoft specifies two methods that can
be used to do this. The first, called a miniclient driver, is
a rasterization-level OpenGL driver that uses the Microsoft
OpenGL software pipeline for lighting and transformation.
This driver would have been easy to create,
but it would not have allowed us to take advantage of
the hardware transformation and lighting provided by
VISUALIZE fx4.

The second method, called an installable client driver, is
a geometry-level OpenGL driver that leaves implementation
of the lighting and transformation pipeline up to the
driver writer. The driver allows us full access to all
OpenGL API routines. This is the route we chose because
we already had a full implementation of OpenGL,
which we had created to run on the HP-UX operating
system. This implementation was ported to the installable
client driver model over a span of several weeks, while
we added support for Windows NT multithreading. The
bulk of the VISUALIZE fx4 graphical device interface
driver was written by a separate team of experts without
much consideration for 3D graphics acceleration. This
enabled them to get the Windows NT display driver running
in a short amount of time and allowed them to continue
enhancing 2D performance without severely impacting
the 3D device driver team. Some of the results of
these efforts are shown in Figure 4.

Figure 4

(a) A 3D image in a 2D environment. (b) Several 3D programs in a 2D environment.

Integrating the Windows NT Driver with Ring 0

A third challenge was to integrate the Windows NT driver
with the ring 0 portion of the OpenGL driver while maintaining
separate code bases for the different teams. We
decided to make our ring 0 driver a separately loadable
library. This decision kept the source code separate. It
enabled much faster edit-compile-debug cycles, since it
allowed us to replace a portion of the ring 0 driver without
having to reboot the computer. However, the separation
added extra complexity because we had two very
different drivers accessing the same piece of hardware.
To solve this problem, we created a variable called a
hardware access token. Each driver has a special token
that it places in the hardware access token to indicate
that it was the last driver to access the hardware. When a
driver detects that the token is not its own, it executes
procedures known as context save and context restore.
The context save reads all applicable hardware state
information from the device into software buffers. The context
restore places the previously saved state back into
the hardware. This same mechanism is used to mediate
hardware accesses between different processes running
OpenGL.
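
The token mechanism can be illustrated with a short sketch. This is a simplified model, not the actual driver code: the Device class, the access function, the token strings, and the dictionary of saved contexts are all hypothetical, and a single "color" register stands in for the full hardware state.

```python
class Device:
    """Stands in for the graphics hardware shared by several drivers."""

    def __init__(self):
        self.access_token = None  # token of the last driver to touch the hardware
        self.registers = {}       # current hardware state


def access(device, my_token, saved_contexts, log):
    """Claim the hardware for my_token, switching contexts only when needed."""
    if device.access_token != my_token:
        if device.access_token is not None:
            # Context save: copy hardware state into a software buffer.
            saved_contexts[device.access_token] = dict(device.registers)
            log.append(("save", device.access_token))
        # Context restore: reload this driver's previously saved state.
        device.registers = dict(saved_contexts.get(my_token, {}))
        log.append(("restore", my_token))
        device.access_token = my_token
    return device.registers


device, saved, log = Device(), {}, []
access(device, "gdi", saved, log)["color"] = "red"
access(device, "opengl", saved, log)["color"] = "blue"
access(device, "opengl", saved, log)        # token unchanged: no save or restore
gdi_registers = access(device, "gdi", saved, log)
print(gdi_registers, log)
```

Because a driver that still holds the token skips both steps, the cost of saving and restoring is paid only when ownership of the hardware actually changes.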

Integration of VISUALIZE fx4 Architecture

A fourth challenge for the team was the integration of the
VISUALIZE fx4 stacked planes architecture (Figure 5a)
into the Windows NT environment. Workstations traditionally
have very deep pixels, each pixel having up to
90 bits of information. This information includes support
for such things as transparent overlays, double buffering,
hidden surface removal, and clipping. Windows NT expects
a slightly different model, in which the extra per pixel
information is allocated in offscreen storage when a 3D
rendering context is created (Figure 5b). What this means
is that when the window state is changed (for example,
when a window is moved on the desktop), Windows NT
does not make any special calls to the device driver. This
presented a problem, since our stacked planes architecture
needs to keep all of the extra information directly
associated with the correct visible screen pixels.

Figure 5

(a) VISUALIZE fx4 stacked frame buffer model. (b) Windows
NT offscreen frame buffer model.

To fix this problem, we used a Windows mechanism
called a window object (Figure 6). The window object
tracks a window state and executes callbacks into our
driver when a window state is modified. This added an
unfortunate amount of complexity into our driver, since
the window state is asynchronous to all other hardware
accesses and not all of the window state information we
need was directly available to us. In addition, applications
expect to be able to mix Windows NT graphical device
interface rendering and 3D OpenGL rendering in the same
window. These two problems required us to add a double
buffering mechanism that actually copies the physical
back buffer bits into the displayed front buffer. This is
significantly slower than the native per pixel double
buffering of VISUALIZE fx4. However, it fits better into the
Windows NT model and enables all applications to run.
We still enable the native method for applications and
benchmarks that work correctly with it, since it is
significantly faster.

Figure 6

The components of a window object.

Performance

A fifth challenge for the team was performance. In the
graphics workstation market, performance is usually the
main differentiator. The most popular single measure of
performance in the PC graphics market is the OPC Viewperf
benchmark known as CDRS-03. By July, 1997, we
had achieved a CDRS-03 rating of 7^i, a performance
level that exceeded all known competitors. This met our
goals set at the beginning of the project. However, we
were aware that the hardware was capable of supporting
much higher performance. With a goal in mind of a
SIGGRAPH 97 announcement in August, we redesigned the
device driver. The redesign optimized certain paths
through the driver, enabling much higher performance
for this benchmark and for important applications such as
Unigraphics and Structural Dynamics Research Corporation
(SDRC). As a result, we were able to announce a
CDRS-03 rating of over 100 at SIGGRAPH 97.

In addition to benchmark performance, the team focused
on application performance because it is typically this
measure that determines whether a customer will buy the
product. We obtained a variety of in-house applications
and built up expertise in running the applications. We
also obtained data sets that represented typical customer
workloads and adjusted various performance parameters
(such as display list size) to maximize performance for
the benchmark. Using this technique, the performance
with some data sets was up to 100 times faster.



Conclusion 



With VISUALIZE fx4, Hewlett-Packard has the fastest
Windows NT graphics on the market. Integrated into
the HP Kayak XW platform, the graphics device and its
successors will help Hewlett-Packard maintain its market
leadership.

Concurrent Engineering in OpenGL® Product
Development



Robert J. Casey 



L. Leonard Lindstone



Time to market was reduced when tasks that had been traditionally
serialized were completed in parallel.

Concurrent engineering is the convergence, in time and purpose, of
interdependent engineering tasks. The benefits of concurrent engineering
versus traditional serial dependency are shown in Figure 1. Careful planning
and management of the concurrent engineering process result in:

■ Faster time to market 

■ Lower engineering expenses

■ Improved schedule predictability.

This article discusses the use of concurrent engineering for OpenGL product
development at the HP Workstation Systems Division.

OpenGL Concurrent Engineering 

We applied concurrent engineering concepts in the development of our
OpenGL product in a number of ways, including: 

■ Closely coupled system design with partner laboratories 

■ Software architecture and design verification

■ Real-use hardware verification 

■ Hardware simulation 

■ Milestones and communication 

■ Joint hardware and software design reviews 

■ Test programs written in parallel.

Cultural Enablers

In addition to these technical tactics, the OpenGL team
enjoyed the benefits of several cultural enablers that have
been nurtured over many years to encourage concurrent
engineering. These include early concurrent staffing, an
environment that invites, expects, and supports bottoms-up
ideas to improve time to market, and the use of a focused
program team to use expertise and gain acceptance from
all functional areas and partners.

System Design with Partner Labs 

We worked closely with the compiler and operating system
laboratories to design new features to greatly improve
our performance (see the "System Design Results"
section in the article on page 9). Our early system design
revealed that OpenGL inherently requires approximately
ten times more procedure calls and graphics device accesses
than our previous graphics libraries. This large
increase in system use meant we had to minimize these
costs we previously had been able to amortize over a
complete primitive.



We worked closely with our partner laboratories to ensure
success. Our management secured partner acceptance,
funding, and staffing, and the engineers worked on the
joint system design. Changes of this magnitude in the
kernel and the compiler take time, and we could not afford
to wait until we had graphics hardware and software
running for problems to occur. Rather, we used careful
system performance models and competitive performance
projections to create processor state count budgets for
procedure calls and device access. These performance
goals guided our design. In fact, our first design to improve
procedure call overhead missed by a few states per call,
so we had to get more creative with our design to arrive
at an industry-leading solution. We managed these dependencies
throughout the project with frequent communication
and interim milestones.

Software Architecture and Design Verification 

We designed and followed a risk-driven life cycle. To support
the concurrent engineering model we needed a life
cycle that avoided the big bang approach of integrating all
the pieces at the end. This would result in a longer and
less predictable time to market. Instead, we created a
prototyping environment. This environment was initially
created to test the software architecture and early design
decisions. The life cycle included a number of checkpoints
focused on interface specification, design, and
prototyping.

Figure 1

The benefits of concurrent engineering.

One key prototyping checkpoint in this environment is
what we called our "vertical slice," which represented a
thin, tall slice through the early OpenGL architecture (see
Figure 2). Thin because it supports a small subset of the
full OpenGL functionality, and tall because it exercises all
portions of the software architecture, from the API to the
device driver-level interface. With this milestone, we had
a simple OpenGL demonstration running on our software
prototype.

The objectives of this vertical slice were to verify the
OpenGL software architecture and design, create a
prototyping design environment, and rally the team around this
key deliverable.



Hardware Verification

Before we had completed verification of the software
architecture, it became evident that this same environment
needed to be quickly adapted and evolved to handle the
demands of hardware verification. OpenGL features and
performance represented the biggest challenge for the
new VISUALIZE fx hardware. Although this hardware
would also support our legacy APIs (Starbase, PHIGS,
PEX), most of the newness and therefore risk was contained
in our support of OpenGL. By evolving our prototyping
environment for use as the hardware verification
vehicle, we were able to exercise the hardware model in
real-use scenarios (albeit considerably slower than full
performance).

Evolving this environment for hardware verification required
us to take the prototyping further than we would
have for software verification alone. We had to add more
functionality to more fully test the OpenGL features in
hardware. We also had to do so quickly to avoid delaying
the hardware tape release.

This led to our second key prototyping checkpoint, which
we called "OpenGL turn on." This milestone included the
same OpenGL demonstration running on the VISUALIZE
fx hardware simulator. We also added functionality
breadth to the vertical slice (see Figure 2). Doing all this
for a new OpenGL API represented a new level of concurrent
engineering, in that we were running OpenGL programs
on a prototype OpenGL library and driver and displaying
pictures on simulated VISUALIZE fx hardware, all
more than a year before shipments.

The key objective of this milestone was to verify system
design across the API, driver, operating system, and hardware.
The system generated pictures and, more importantly,
spool files (command and data streams that cross
the hardware and software interface). These spool files
are then run against the hardware models to verify hardware
design under real OpenGL use scenarios.

This prototyping environment has the following 
advantages: 

■ Reduces risk for system design and component design

□ Resolve integration issues early
□ Identify holes and design or architecture flaws
□ Enable prototyping to evaluate design alternatives

■ Enables key deliverables (hardware verification spool
files)

■ Creates exciting focal points for developers

■ Fosters teamwork

■ Enables joint development

■ Provides a means to monitor progress

■ Provides a jump start to our code development phase.

This environment also has potential downsides. We felt
there was a risk that developers would feel that the need
or desire to prototype (for system turn on and hardware
verification) could overshadow the importance of product
design. We did not want to leave engineers with the model:
write some code, give it a try, and ship it if it works.

Thus, to keep the benefits of this environment and mitigate
these potential downsides, we made a conscious decision
to switch gears from system turn on and prototype
mode to product code development mode. This point
came after we had delivered the spool files required for
hardware verification and before we had reached our
design complete checkpoint. From that point on, we
prototyped only for design purposes, not for enabling
more system functionality. We also created explicit checkpoints
for replacing previously prototyped code with
designed product code. This was an important shift to
avoid shipping prototype code. All product code had to
be designed and reviewed.

Hardware Simulation 

One key factor in our concurrent engineering process is
hardware simulation. A detailed discussion of the hardware
simulation techniques used in our project is beyond
the scope of this article. Briefly, we use three levels
of hardware simulation:

■ A behavioral model (written in C)

■ A register transfer level model (RTL)

■ A gate model, which models the gate design and
implementation.

The advantages of the behavioral model are that it can be
done well before the RTL and gate model so we can use it
with other components and prototypes. The behavioral
model is also significantly faster than the other models
(though still about 100 times slower than the real product),
allowing us to run many simple real programs on it. The
RTL model runs in Verilog and runs about one million
times slower than the real product. This limits the number
and size of test cases that can be run. The gate model is
even slower. Even so, we kept over 30 workstations busy
around the clock for months running these models. Often
a simulation run will use C models for all but one of the
new chips, with the one chip being simulated at the gate
level.
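
The practical effect of these slowdown factors is easy to work out. The short calculation below assumes a hypothetical workload that takes 10 ms on the real product; that figure is an assumption chosen for illustration, and only the factors of about 100 and about one million come from the text.

```python
# Approximate slowdown factors quoted above, relative to the real product.
SLOWDOWN = {
    "behavioral model (C)": 100,
    "RTL model (Verilog)": 1_000_000,
}


def simulated_seconds(real_seconds, slowdown):
    """Estimate wall-clock time to simulate a workload of real_seconds."""
    return real_seconds * slowdown


real_workload = 0.010  # seconds on real hardware (assumed for illustration)
behavioral = simulated_seconds(real_workload, SLOWDOWN["behavioral model (C)"])
rtl = simulated_seconds(real_workload, SLOWDOWN["RTL model (Verilog)"])

print(f"behavioral: about {behavioral:.0f} second")
print(f"RTL: about {rtl / 3600:.1f} hours")
```

Under these assumptions a test that is instantaneous on real hardware takes about a second on the behavioral model and close to three hours at the register transfer level, which is consistent with the need to limit the number and size of RTL test cases.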

Milestones and Communication 

We set up a number of R&D milestones to guide and track
our progress. The vertical slice and OpenGL turn on were
two such key milestones. OpenGL developer meetings
were held monthly to make sure that everyone had a clear
understanding of where we were headed and how each of
the developers' contributions helped us get there.

Software and Hardware Design Reviews 

The hardware and software engineers also held joint design
reviews. The value of design reviews is to minimize
defects by enabling all the engineers to have the same
model of the system and to catch design flaws early and
correct them while defect finding and fixing is still inexpensive
in terms of schedule and dollars.

On the software side, the review process focused heavily
on up-front design reviews (where changes are cheaper)
to get the design right. We maintained the importance of
doing inspections but reduced the inspection coverage
from 100 percent to a smaller representative subset of
code, as determined by the review team. We also increased
the number of reviewers at the design reviews and
reduced the participation as we moved to code reviews.
We maintained a consistent core set of reviewers who
followed the component from design to code review.

Tests Written in Parallel 

To bring more parallelism to the development process,
we had an outside organization develop our OpenGL test
programs. By doing so, we were able to begin nightly
regression testing simultaneous with the code completion
checkpoint because the test programs were immediately
available. Historically, the developers have written the
tests following design and coding. This translates into
a lull between the code completion checkpoint and the
beginning of the testing phase.

May 1998 • The Hewlett-Packard Journal

© Copr. 1949-1998 Hewlett-Packard Co.

Parallel development of the tests with the design and
implementation of the system was a key success factor
in our ability to ship a high-quality software-only beta
version of our OpenGL product. No severe defects were
found in this beta product, our first OpenGL customer
deliverable.

One thing we learned from using an outside organization
to help with test writing was that writing test plans is
more a part of design than of testing. The developers,
with intimate knowledge of the API and the design, were
able to write much more comprehensive test plans than
the outside organization.



Conclusion



We achieved several positive results through the use of
concurrent engineering on our OpenGL product. Ultimately,
we reduced time to market by several months.
Along the way, we made performance and reliability
improvements in our software and hardware architectures
and implementations, and we likely prevented a chip turn
or two, which would have cost significant time to market.

Silicon Graphics and OpenGL are registered trademarks of Silicon Graphics Inc.
in the United States and other countries.

Direct3D is a U.S. registered trademark of Microsoft Corporation.

Microsoft is a U.S. registered trademark of Microsoft Corporation.

Multiple monitors can be configured as a contiguous viewing space to
provide more screen space so that users can see most if not all of their
applications without any special window manipulations.



In today's computing environment, screen space is at a premium. The
entire screen can be easily consumed when primary work-specific applications
are used together with browsers, schedulers, mailers, and editors. This forces
the user to continuously shuffle windows, which is both distracting and
unproductive.

The advanced display technologies described here allow users to increase
productivity by reducing the time spent manipulating windows. Three
technologies are discussed:

■ Multiscreen 

■ Single logical screen (SLS) 

■ SLSclone. 

Implementation details and procedures for configuring HP-UX workstations to
use the SLS technology are described in references 1 and 2. 

Multiscreen

When considering the problem of limited screen space, one solution that
comes to mind is to use a bigger monitor with a higher resolution.
Unfortunately, it is often impractical to add a monitor with a resolution high
enough to accommodate all the data a user wants to view. Although demand has
increased for monitors of higher resolution, such as 2K by 2K pixels, they are
still too expensive for companies to place on every desktop. In addition, these
large monitors are cumbersome and heavy. There are also
safety considerations: the monitor must be stable and
properly supported.

A more practical, cost-effective solution is to use
additional standalone monitors to increase the amount of
visible screen space. The X Window System (X11)
standard incorporates a feature known as multiscreen, which
provides this type of environment. In multiscreen
configurations, a single X server is used to control more than one
graphics device and monitor simultaneously. These types
of configurations are only possible on systems containing
multiple graphics devices.

In these multiscreen scenarios, a single mouse and
keyboard are shared between screens. This allows the pointer
but not the windows to move between screens. Each
application must be directed to a specific screen to display
its windows. This is done by either using the -display
command line argument or by setting the DISPLAY
environment variable.
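
As a rough illustration of how an application is directed to a screen, the
sketch below builds and parses X display names of the form
host:display.screen, the syntax used by both the -display argument and the
DISPLAY variable. The helper names are hypothetical, and the code only
prepares an environment; it does not start an X client.

```python
import os


def display_name(screen, display=0, host=""):
    """Build an X display name such as ':0.1' (host:display.screen)."""
    return f"{host}:{display}.{screen}"


def parse_display(name):
    """Split 'host:display.screen' into (host, display, screen).

    The screen part is optional in X display names and defaults to 0.
    """
    host, _, rest = name.rpartition(":")
    display, _, screen = rest.partition(".")
    return host, int(display), int(screen or 0)


def env_for_screen(screen, base_env=None):
    """Return a copy of the environment with DISPLAY set to one screen."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display_name(screen)
    return env
```

A launcher could pass env_for_screen(1) as the env argument of
subprocess.Popen to start a client on the second screen of a two-monitor
multiscreen configuration.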

Figure 1 shows a two-monitor multiscreen configuration.
Both monitors are connected to the same workstation and
are controlled by the same X server. This type of
configuration effectively doubles the visible workspace. For
example, users could have their alternate applications, such as
web browsers, mailers, and schedulers on the left-hand
monitor and their primary applications on the right-hand
monitor. Since the X server controls both screens, the
pointer can move between screens and be used with any
application.

Multiscreen offers the advantage that it will work with
any graphics device. There are no constraints that the
graphics devices be identical or have the same properties.



Figure 1

A multiscreen configuration. (Diagram labels: SPU with two graphics cards;
display:0.0; display:0.1.)

Figure 2

Cursor wraparound in a multiscreen configuration. (Panels (a) and (b), each
showing Screen 1, Screen 2, and Screen 3.)



For example, on an HP 9000 Model 715 workstation
containing an HCRX24 display (a 24-plane device) and an
internal color graphics display (an 8-plane device), the user
can still create a multiscreen configuration. Of course,
those applications directed to the HCRX24 will have
access to 24 planes while those contained on the other are
limited to 8 planes. Currently, the HP-UX X server allows
a maximum of four graphics devices to be used in a
multiscreen configuration.

The HP-UX X server also provides several enhancements
to simplify the use of a multiscreen configuration. If a user
has a 1-by-3 configuration (Figure 2a), there may be a
need to move the pointer from screen 3 to screen 1. This
requires moving the pointer from screen 3 to screen 2 to
screen 1. By specifying an X server configuration option,
the user can move the pointer off the right edge of screen
3, and the pointer will wrap to screen 1 (Figure 2b). The
same screen wrapping functionality can be provided if the
user has configured the screens in a column. Finally, a
2-by-2 configuration can contain both horizontal and
vertical screen wrapping.
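
The wrapping behavior described above can be pictured as modular arithmetic
on a grid of screens. The following is a minimal sketch under that
assumption; the function name, the zero-based (row, column) addressing, and
the two wrap flags are illustrative and are not taken from the HP-UX X
server.

```python
def next_screen(row, col, rows, cols, direction,
                wrap_horizontal=False, wrap_vertical=False):
    """Return the (row, col) of the screen the pointer enters.

    If the pointer leaves an outer edge and wrapping is disabled for that
    axis, it stays on the current screen.
    """
    steps = {"left": (0, -1), "right": (0, 1), "up": (-1, 0), "down": (1, 0)}
    d_row, d_col = steps[direction]
    new_row, new_col = row + d_row, col + d_col
    if not 0 <= new_col < cols:
        new_col = new_col % cols if wrap_horizontal else col
    if not 0 <= new_row < rows:
        new_row = new_row % rows if wrap_vertical else row
    return new_row, new_col
```

In a 1-by-3 layout, moving right from the third screen (column 2) with
horizontal wrapping enabled lands on the first screen (column 0), which
corresponds to the case shown in Figure 2b.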

Although multiscreen is convenient, it has shortcomings.
Namely, the monitors function as separate entities, rather
than as a contiguous space. The different screens within a
multiscreen configuration cannot communicate with one
another with respect to window placement. This means
that windows cannot be moved between monitors. Once
a window is created, it is bound to the monitor where it is
created. Although some third-party solutions are available
to help alleviate this problem, they are costly,
inconvenient (sometimes requiring the application to make code
changes), and lack performance.

The lack of communication between screens with respect
to window placement forces users to direct their
applications towards a specific screen at application start time.
After a screen has been selected, all additional
subwindows will be confined to that screen. With today's larger
applications, it is possible to find that certain screens still
get overcrowded, resulting in the original predicament of
having to iconify and raise windows.

Single Logical Screen

To remedy the shortfall of the multiscreen configuration,
HP developed a technology called single logical screen
(SLS).³ SLS has been incorporated into the HP standard
X server product and allows multiple monitors to act as a
single, larger, contiguous screen. As a result, windows can
move across physical screen boundaries, and they can
span more than one physical monitor. In addition, SLS
functionality has been implemented in an
application-transparent manner. This means that any application
currently running on HP-UX workstations will run, without
modification, under SLS. Therefore, SLS is not an API that
application writers need to program to or that an
application needs to be aware of. The application simply sees a
large screen. This ease of use lets end users take
advantage of a large workspace without requiring applications
to be rewritten or recompiled.

Many electronic design automation (EDA) and
computer-aided design applications can benefit from SLS. Some of
these applications, by themselves, can easily occupy an
entire screen while only showing a fraction of the desired
information. For example, with more screen real estate,
an EDA application can simultaneously display
waveforms, schematics, editors, and other data without having
any of this information obscured. To do this on a
workstation with only a single monitor would require
displaying the waveforms, schematics, and other items in such
small areas as to be unreadable.

On HP-UX workstations, a single logical screen actually
represents a collection of homogeneous graphics devices
whose output has been combined into a single screen.
Figure 3 shows an example of a 1-by-2 SLS
configuration. Most HP-UX workstations are not limited to only
two graphics devices. Some models support up to four
devices. When using these graphics devices to create an
SLS environment, any rectangular configuration is allowed.
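
One way to picture a single logical screen is as a coordinate mapping from
one large logical screen onto a rectangular grid of identical monitors. The
sketch below assumes row-major monitor numbering from 0 and a hypothetical
1280-by-1024 resolution per monitor in the usage note; it illustrates the
idea only and does not describe how the HP X server implements SLS.

```python
def locate(x, y, mon_w, mon_h, cols, rows):
    """Map a logical pixel to (monitor_index, local_x, local_y)."""
    if not (0 <= x < mon_w * cols and 0 <= y < mon_h * rows):
        raise ValueError("point lies outside the logical screen")
    col, local_x = divmod(x, mon_w)
    row, local_y = divmod(y, mon_h)
    return row * cols + col, local_x, local_y


def monitors_spanned(x, y, width, height, mon_w, mon_h, cols, rows):
    """List the monitors (row-major indices) that a window overlaps."""
    first_col = max(x // mon_w, 0)
    last_col = min((x + width - 1) // mon_w, cols - 1)
    first_row = max(y // mon_h, 0)
    last_row = min((y + height - 1) // mon_h, rows - 1)
    return [r * cols + c
            for r in range(first_row, last_row + 1)
            for c in range(first_col, last_col + 1)]
```

With two 1280-by-1024 monitors side by side, a 600-pixel-wide window placed
at x = 1000 overlaps both monitors, which is the window-spanning case that
multiscreen alone cannot provide.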



Figure 3

A 1-by-2 SLS configuration.

SLSclone

SLSclone is similar to the SLS configuration. The
difference is that the contents from a selected monitor are
replicated on all other monitors in the configuration (see
Figure 4). A user can dynamically switch between SLS
and SLSclone using an applet being shipped with the
HP-UX 10.20 patch PHSS_12462 or later.

This functionality is useful in an educational or
instructional environment. Instead of crowding many users
around a single monitor to view its contents, SLSclone
can be used to pipe these contents to neighboring
monitors. As with SLS, SLSclone currently supports up to four
physical monitors, depending on the workstation model.

SLSclone functionality easily lends itself to a collaborative
work environment. If additional people enter a user's
office to debug some software source code, for example,
the user can quickly switch the SLS configuration into an
SLSclone configuration, and the debugging screen will be
displayed on all monitors. Also, the additional monitor
can easily be adjusted to the correct height and tilt
without affecting the original user's view of the display.



Figure 4

An example of a 1-by-2 SLSclone configuration.

Figure 5

A hybrid configuration consisting of a 1-by-2 SLS with multiscreen.

SLS and Multiscreen

Even with the benefits of SLS, there may be cases in
which a user will want to use SLS and multiscreen at the
same time. For example, a user could have a 1-by-2 SLS
configuration acting as one screen, and a third monitor
acting as a second screen. A depiction of this is shown in
Figure 5.

In this type of configuration, a user can move windows
between physical monitors 1 and 2 but not drag a window
from monitor 2 to monitor 3. The pointer, however, can
move between all monitors. This type of hybrid
configuration can be useful in a software development environment.
All of the necessary editors, compilers, and debuggers can
be used on monitors 1 and 2, and the application can be
run and tested on monitor 3.

If a workstation supports four graphics devices, another
possible hybrid configuration is to use two screens,
each of which consists of a two-screen SLS configuration
(Figure 6).

In this configuration, windows can be moved between
monitors 1 and 2 or between monitors 3 and 4. However, a
window cannot be moved between monitors 2 and 3. As
with all multiscreen configurations, the pointer can move
across all four monitors. These two screens could also
be placed vertically, resulting in a 2-by-2 monitor
arrangement and a 2-by-1 multiscreen configuration.
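
The movement rules for hybrid configurations can be summarized in a small
model: each X screen is a set of monitor numbers, windows may move only
within one screen, and the pointer may move anywhere. The names below are
hypothetical and the model is only a sketch of the rules stated above.

```python
# Each layout is a list of X screens; each screen is a set of monitor numbers.
FIGURE_5 = [{1, 2}, {3}]      # a 1-by-2 SLS screen plus a one-monitor screen
FIGURE_6 = [{1, 2}, {3, 4}]   # two 1-by-2 SLS screens joined by multiscreen


def screen_of(monitor, layout):
    """Return the index of the X screen that contains the given monitor."""
    for index, monitors in enumerate(layout):
        if monitor in monitors:
            return index
    raise ValueError(f"monitor {monitor} is not part of the layout")


def can_move_window(src, dst, layout):
    """Windows can move only between monitors of the same X screen."""
    return screen_of(src, layout) == screen_of(dst, layout)


def can_move_pointer(src, dst, layout):
    """The pointer can reach every monitor controlled by the X server."""
    screen_of(src, layout)  # raises if either monitor is unknown
    screen_of(dst, layout)
    return True
```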



Conclusion 



Advanced display configurations can be used to increase
productivity. The increase in screen space facilitates
collaboration and communication of information. We have
also found that these configurations are very useful for
independent software vendors (ISVs) who demonstrate
their applications on HP-UX workstations. They
appreciate the additional screen space because they are able to
display more information and rapidly describe their
products without losing their customers' attention.

Finally, the configuration of an advanced display is
accomplished in an easy and straightforward manner through
the HP-UX System Administration Manager (SAM).
Additional information on advanced display configurations
and other exciting X server features is available at:
http://www.hp.com/go/xwindow

HP-UX Release 10.20 and later and HP-UX 11.00 and later (in both 32- and 64-bit
configurations) on all HP 9000 computers are Open Group UNIX 95 branded products.

UNIX is a registered trademark of The Open Group.



References 



1. T. Spencer and P. Anderson, "Implementation of Advanced
Display Technologies on HP-UX Workstations," Hewlett-Packard
Journal, Vol. 49, no. 2, May 1998 (available online only).

http://www.hp.com/hpj/98may/ma98a7a.htm

2. R. MacDonald, "Hewlett-Packard's Approach to Dynamic
Loading within the X Server," Hewlett-Packard Journal, Vol. 49,
no. 2, May 1998 (available online only).

http://www.hp.com/hpj/98may/ma98a7b.htm

3. M. Allison, P. Anderson, and J. Walls, "Single Logical Screen,"
InterWorks '97 Proceedings, April 1997, pp. 366-370.



Figure 6

Two 1-by-2 SLS configurations combined via multiscreen. (Diagram labels:
display:0.0; display:0.1.)







Todd M. Spencer

A software engineer at the HP Workstation Systems Division, Todd
Spencer was responsible for development of the SAM component that
allows users to set up multiscreen and single logical screen
configurations. He came to HP in 1989 after receiving an MS degree in
computer science from the University of Utah. Todd was born in Utah,
is married and has four children. His outside interests include
tropical fish, camping, woodworking, piano (playing classical music),
and jogging.



Paul M. Anderson

Paul Anderson is a software engineer at the HP Workstation Systems
Division. He joined HP in 1996 after receiving a BS degree in computer
science from the University of Minnesota. He is currently working on
device drivers for new peripheral technologies. His professional
interests include I/O drivers, operating systems, and networking.
Paul was born in Edina, Minnesota. His outside interests include
hiking, music, and mountain biking.



David J. Sweetser

With HP since 1977, David Sweetser is a project manager at the HP
Workstation Systems Division. He is responsible for the X server and
some of the client-side X libraries. He received a BSEE degree and an
MSEE degree from Harvey Mudd College in 1975. He was born in Woodland,
California, is married and has two children. His outside interests
include mountain biking, hiking, snowshoeing, cross-country skiing,
and white-water rafting.



Workstations: A Case Study in the Challenges

In the highly competitive workstation market, customers demand a wide range
of cost-effective, high-performance I/O solutions. An industry-standard I/O
subsystem allows HP workstations to support the latest I/O technology.



Industry-standard I/O buses like the Peripheral Component Interconnect
(PCI) allow systems to provide a wide variety of cost-effective I/O functionality.
The desire to include more industry-standard interfaces in computer systems
continues to increase. This article points out some of the specific
methodologies used to implement and verify the PCI interface in HP workstations and
describes some of the challenges associated with interfacing with
industry-standard I/O buses.

PCI for Workstations 

One of the greatest challenges in designing a workstation system is determining
the best way to differentiate the design from competing products. This decision
determines where the design team will focus their efforts and have the greatest
opportunity to innovate. In the computer workstation industry, the focus is
typically on processor performance coupled with high-bandwidth, low-latency
memory connections to feed powerful graphics devices. The performance of
nongraphics I/O devices in workstations is increasing in importance, but the
availability of cost-effective solutions is still the chief concern in designing an
I/O subsystem. Rather than providing a select few exotic high-performance I/O
solutions, it is better to make sure that there is a wide range of cost-effective
solutions to provide the I/O functionality that each customer requires. Since
I/O performance is not a primary means of differentiation and since maximum
flexibility with appropriate price and performance is desired, using an
industry-standard I/O bus that operates with high-volume
cards from multiple vendors is a good choice.

The PCI bus is a recently established standard that has
achieved wide acceptance in the PC industry. Most new
general-purpose I/O cards intended for use in PCs and
workstations are now being designed for PCI. The PCI
bus was developed by the PCI Special Interest Group
(PCI SIG), which was founded by Intel and now consists
of many computer vendors. PCI is designed to meet today's
I/O performance needs and is scalable to meet future
needs. Having PCI in workstation systems allows the use
of competitively priced cards already available for use in
the high-volume PC business. It also allows workstations
to keep pace with new I/O functionality as it becomes
available, since new devices are typically designed for the
industry-standard bus first and only later (if at all) ported
to other standards. For these reasons, the PCI bus has
been implemented in the HP B-class and C-class
workstations.

PCI Integration Effort 

Integrating PCI into our workstation products required
a great deal of work by both the hardware and software
teams. The hardware effort included designing a bus
interface ASIC (application-specific integrated circuit)
to connect to the PCI bus and then performing functional
and electrical testing to make sure that the
implementation would work properly. The software effort included
writing firmware to initialize and control the bus interface
ASIC and PCI cards and writing device drivers to allow
the HP-UX operating system to make use of the PCI
cards.

The goals of the effort to bring PCI to HP workstation
products were to:

■ Provide our systems with fully compatible PCI to
allow the support of a wide variety of I/O cards and
functionality

■ Achieve an acceptable performance in a cost-effective
manner for cards plugged into the PCI bus

■ Create a solution that does not cause performance
degradation in the CPU-memory-graphics path or in any
of the other I/O devices on other buses in the system

■ Ship the first PCI-enabled workstations: the
Hewlett-Packard B132, B160, C160, and C180 systems.

Challenges 

Implementing an industry-standard I/O bus might seem
to be a straightforward endeavor. The PCI interface has
a thorough specification, developed and influenced by
many experts in the field of I/O bus architectures. There
is momentum in the industry to make sure the standard
succeeds. This momentum includes card vendors
working to design I/O cards, system vendors working through
the design issues of the specification, and test and
measurement firms developing technologies to test the design
once it exists. Many of these elements did not yet exist
and were challenges for earlier Hewlett-Packard
proprietary I/O interface projects.

Although there were many elements in the team's favor
that did not exist in the past, there were still some
significant tasks in integrating this industry-standard bus. These
tasks included:

■ Designing the architecture for the bus interface ASIC,
which provides a high-performance interface between
the internal proprietary workstation buses and PCI

■ Verifying that the bus interface ASIC does what it is
intended to do, both in compliance with PCI and in
performance goals defined by the team

■ Providing the necessary system support, primarily in
the form of firmware and system software to allow
cards plugged into the slots on the bus interface ASIC
to work with the HP-UX operating system.

With these design tasks identified, there still remained
some formidable challenges for the bus interface ASIC
design and verification and the software development
teams. These challenges included ambiguities in the PCI
specification, difficulties in determining migration plans,
differences in the way PCI cards can operate within the
PCI specification, and the unavailability of PCI cards
with the necessary HP-UX drivers.






Architecture 

The Bus Interface ASIC 

The role of the bus interface ASIC is to bridge the HP
proprietary I/O bus, called the general system connect
(GSC) bus, to the PCI bus in the HP B-class and C-class
workstations. Figures 1 and 2 show the B-class and
C-class workstation system block diagrams with the bus
interface ASIC bridging the GSC bus to the PCI bus. The
Runway bus shown in Figure 2 is a high-speed
processor-to-memory bus.

The bus interface ASIC maps portions of the GSC bus
address space onto the PCI bus address space and vice
versa. System firmware allocates addresses to map
between the GSC and PCI buses and programs this
information into configuration registers in the bus interface ASIC.
Once programmed, the bus interface ASIC performs the
following tasks:

■ Forwards write transactions from the GSC bus to the
PCI bus. Since the write originates in the processor, this
task is called a processor I/O write.

■ Forwards read requests from the GSC bus to the PCI
bus, waits for a PCI device to respond, and returns the
read data from the PCI bus back to the GSC bus. Since
the read originates in the processor, this task is called
a processor I/O read.

■ Forwards write transactions from the PCI bus to the
GSC bus. Since the destination of the write transaction
is main memory, this task is called a direct memory
access (DMA) write.

■ Forwards read requests from the PCI bus to the GSC
bus, waits for the GSC host to respond, and returns the
read data from the GSC bus to the PCI bus. Since the
source of the read data is main memory, this task is
called a DMA read.

Figure 1

HP B-class workstation block diagram.
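
To make the address-mapping step concrete, here is a minimal sketch of
window-based translation between two bus address spaces. The Window
structure, the translate function, and the example addresses are all
hypothetical; the article does not describe the layout of the ASIC's
configuration registers, so this only illustrates the general idea of
mapping a portion of one address space onto another.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    src_base: int  # first address of the window on the originating bus
    dst_base: int  # address on the target bus that src_base maps to
    size: int      # window length in bytes


def translate(addr, windows):
    """Translate an address through the first matching window.

    Returns None when no window claims the address, meaning the bridge
    would not forward the transaction.
    """
    for w in windows:
        if w.src_base <= addr < w.src_base + w.size:
            return w.dst_base + (addr - w.src_base)
    return None


# Hypothetical window, as firmware might program it for processor I/O.
GSC_TO_PCI = [Window(src_base=0xF400_0000, dst_base=0x0000_0000,
                     size=0x0100_0000)]
```

A second list of windows with PCI addresses as the source side would model
the DMA direction in the same way.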

Figure 3 shows a block diagram of the internal
architecture of the bus interface ASIC. The bus interface ASIC
uses five asynchronous FIFOs to send address, data, and
transaction information between the GSC and PCI buses.



Figure 2

HP C-class workstation block diagram.

Figure 3

A block diagram of the architecture for the bus interface ASIC. (Diagram
labels include PCI slots and interrupt controller.)


A FIFO is a memory device that has a port for writing data
into the FIFO and a separate port for reading data out of
the FIFO. Data is read from the FIFO in the same order
that it was written into the FIFO. The GSC bus clock is
asynchronous to the PCI bus clock. For this reason, the
FIFOs need to be asynchronous. An asynchronous FIFO
allows the data to be written into the FIFO with a clock
that is asynchronous to the clock used to read data from
the FIFO.
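
The ordering property of a FIFO can be shown with a small software model.
This is only a behavioral sketch: the class name and the depth parameter are
assumptions, and the model does not attempt to capture the clock-domain
crossing that makes the hardware FIFOs asynchronous.

```python
from collections import deque


class Fifo:
    """Bounded first-in, first-out buffer with write and read operations."""

    def __init__(self, depth):
        self._depth = depth
        self._entries = deque()

    def full(self):
        return len(self._entries) >= self._depth

    def empty(self):
        return not self._entries

    def write(self, item):
        """Write port: store an entry; returns False if the FIFO is full."""
        if self.full():
            return False
        self._entries.append(item)
        return True

    def read(self):
        """Read port: return the oldest entry, or None if the FIFO is empty."""
        return self._entries.popleft() if self._entries else None
```

In this model the writer only ever calls write and the reader only ever
calls read, mirroring the separate ports described above.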

Data flows through the bus interface ASIC are as follows:

■ Processor I/O write:

□ The GSC interface receives both the address and the
data for the processor I/O write from the GSC bus and
loads them into the processor I/O FIFO.

□ The PCI interface arbitrates for the PCI bus.

□ The PCI interface unloads the address and data from
the processor I/O FIFO and masters the write on the
PCI bus.

■ Processor I/O read:

□ The GSC interface receives the address for the
processor I/O read from the GSC bus and loads it into the
processor I/O FIFO.

□ The PCI interface arbitrates for the PCI bus.

□ The PCI interface unloads the address from the
processor I/O FIFO and masters a read on the PCI bus.

□ The PCI interface waits for the read data to return and
loads the data into the processor I/O read return FIFO.

□ The GSC interface unloads the processor I/O read
return FIFO and places the read data on the GSC bus.

■ DMA write:

□ The PCI interface receives both the address and the
data for the DMA write from the PCI bus and loads
them into the DMA FIFO.

□ The PCI interface loads control information for the
write into the DMA transaction FIFO.

□ The GSC interface arbitrates for the GSC bus.

□ The GSC interface unloads the write command from
the DMA transaction FIFO, unloads the address and
data from the DMA FIFO, and masters the write on
the GSC bus.

■ DMA read:

□ The PCI interface receives the address for the DMA
read from the PCI bus and loads it into the DMA FIFO.

□ The GSC interface arbitrates for the GSC bus.

□ The GSC interface unloads the address from the DMA
FIFO and masters a read on the GSC bus.

□ The GSC interface then waits for the read data to
return and loads the data into the DMA read return
FIFO.

□ The PCI interface unloads the DMA read return FIFO
and places the read data on the PCI bus.

Architectural Challenges

One of the difficulties of joining two dissimilar I/O buses is
achieving peak I/O bus performance despite the fact that
the transaction structures are different for both I/O buses.
For example, transactions on the GSC bus are fixed length
with not more than eight words per transaction while
transactions on the PCI bus are of arbitrary length. It is
critical to create long PCI transactions to reach peak
bandwidth on the PCI bus. For better performance and
whenever possible, the bus interface ASIC coalesces
multiple processor I/O write transactions from the GSC bus
into a single processor I/O write transaction on the PCI
bus. For DMA writes, the bus interface ASIC needs to
determine the optimal method of breaking variable-size PCI
transactions into one-, two-, four-, or eight-word GSC
transactions. The PCI interface breaks DMA writes into
packets and communicates the transaction size to
the GSC interface through the DMA transaction FIFO.
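
One simple way to break a variable-size write into the GSC transaction sizes
listed above is a greedy, largest-first split, which uses the fewest
transactions for the sizes one, two, four, and eight. The sketch below makes
that assumption and ignores address alignment and any other constraints the
real ASIC may apply; the article does not state which method the ASIC
actually uses.

```python
GSC_SIZES = (8, 4, 2, 1)  # GSC transaction lengths in words, from the text


def split_dma_write(word_count):
    """Break a PCI write of word_count words into GSC-sized pieces.

    Greedy, largest-first: take as many eight-word pieces as fit, then
    four-word pieces, and so on.
    """
    if word_count < 0:
        raise ValueError("word_count must be non-negative")
    pieces = []
    remaining = word_count
    for size in GSC_SIZES:
        count, remaining = divmod(remaining, size)
        pieces.extend([size] * count)
    return pieces
```

For example, a 13-word PCI write becomes one eight-word, one four-word, and
one single-word GSC transaction.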



Another difficulty of joining two dissimilar I/O buses is
avoiding deadlock conditions. Deadlock conditions can
occur when a transaction begins on both the GSC and PCI
buses simultaneously. For example, if a processor I/O read
begins on the GSC bus at the same time a DMA read
begins on the PCI bus, then the processor I/O read will wait
for the DMA read to be completed before it can master its
read on the PCI bus. Meanwhile, the DMA read will wait
for the processor I/O read to be completed before it can
master its read on the GSC bus. Since both reads are
waiting for the other to be completed, we have a deadlock
case. One solution to this problem is to detect the
deadlock case and retry or split one of the transactions to
break the deadlock. In general, the bus interface ASIC
uses the GSC split protocol to divide a GSC transaction
and allow a PCI transaction to make forward progress
whenever it detects a potential deadlock condition.

Unfortunately, the bus interface ASIC adds more latency
to the round trip of DMA reads. This extra latency can
have a negative effect on DMA read performance. The
C-class workstation has a greater latency on DMA reads
than the B-class workstation. This is due primarily to the
extra layer of bus bridges that the DMA read must traverse
to get to memory and back (refer to Figures 1 and 2).
The performance of DMA reads is important to outbound
DMA devices such as network cards and disk controllers.
The extra read latency is hidden by prefetching
consecutive data words from main memory with the expectation
that the I/O device needs a block of data and not just a
word or two.

Open Standard Challenges 

The PCI bus specification, like most specifications, is not
perfect. There are areas where the specification is vague
and open to interpretation. Ideally, when we find a vague
area of a specification, we investigate how other
designers have interpreted the specification and follow the
trend. With a proprietary bus this often means simply
contacting our partners within HP and resolving the issue.
With an industry-standard bus, our partners are not within
the company, so resolving the issue is more difficult. The
PCI mail reflector, which is run by the PCI SIG at
www.pcisig.com, is sometimes helpful for resolving such
issues. Monitoring the PCI mail reflector also gives the
benefit of seeing what parts of the PCI specification
appear vague to others. Simply put, engineers designing to
a standard need a forum for communicating with others
using that standard. When designing to an industry
standard, that forum must by necessity include wide
representation from the industry.

The PCI specification has guidelines and migration plans
that PCI card vendors are encouraged to follow. In
practice, PCI card vendors are slow to move from legacy
standards to follow guidelines or migration plans. For
example, the PCI bus supports a legacy I/O* address
space that is small and fragmented. The PCI bus also has
a memory address space that is large and has higher write
bandwidth than the I/O address space. For obvious
reasons, the PCI specification recommends that all PCI cards
map their registers to the PCI I/O address space and the
PCI memory address space so systems will have the most
flexibility in allocating base addresses to I/O cards. In
practice, most PCI cards still only support the PCI I/O address
space. Since we believed that the PCI I/O address space
would almost never be used, trade-offs were made in the
design of the bus interface ASIC that compromised the
performance of transactions to the PCI I/O address space.

Another example in which the PCI card vendors follow
legacy standards rather than PCI specification guidelines
is in the area of PCI migration from 5 volts to 3.3 volts.
The PCI specification defines two types of PCI slots: one
for a 5-volt signaling environment and one for a 3.3-volt
signaling environment. The specification also defines
three possible types of I/O cards: 5-volt only, 3.3-volt only,
or universal. As their names imply, 5-volt-only and
3.3-volt-only cards only work in 5-volt and 3.3-volt slots
respectively. Universal cards can work in either a 5-volt or
3.3-volt slot. The PCI specification recommends that PCI
card vendors only develop universal cards. Even though
it costs no more to manufacture a universal card than a
5-volt card, PCI card vendors are slow to migrate to
universal cards until volume platforms (that is, Intel-based
PC platforms) begin to have 3.3-volt slots.
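
The card and slot combinations described above reduce to a simple
compatibility rule. The function below is a small sketch of that rule; the
string labels are illustrative names chosen here, not identifiers from the
PCI specification.

```python
SLOT_TYPES = ("5V", "3.3V")
CARD_TYPES = ("5V", "3.3V", "universal")


def card_fits_slot(card, slot):
    """Return True if a card of the given type works in the given slot.

    Universal cards work in either slot; single-voltage cards work only
    in a slot with the matching signaling environment.
    """
    if card not in CARD_TYPES:
        raise ValueError(f"unknown card type: {card}")
    if slot not in SLOT_TYPES:
        raise ValueError(f"unknown slot type: {slot}")
    return card == "universal" or card == slot
```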

Verification 

Verification Methodology and Goals

The purpose of verification is to ensure that the bus
interface ASIC correctly meets the requirements described in
the design specification. In our VLSI development process
this verification effort is broken into two distinct parts
called phase-1 and phase-2. Both parts have the intent of
proving that the design is correct, but each uses different
tools and methods to do so. Phase-1 verification is carried
out on a software-based simulator using a model of the
bus interface ASIC. Phase-2 verification is carried out on
real chips in real systems.

* Legacy refers to the PC I/O port space.

Phase-1. The primary goals of phase-1 verification can be
summarized as correctness, performance, and compliance.
Proving correctness entails showing that the Verilog model of
the design properly produces the behavior detailed in the
specification. This is done by studying the design
specification, enumerating a function list of operations and
behaviors that the design is required to exhibit, and
generating a suite of tests that verify all items on that
function list. Creating sets of randomly generated
transaction combinations enhances the test coverage by
exposing the design to numerous corner cases.

Performance verification is then carried out to prove that
the design meets or exceeds all important performance
criteria. This is verified by first identifying the important
performance cases, such as key bandwidths and latencies, and
then generating tests that produce simulated loads for
performance measurement.

Finally, compliance testing is used to prove that the bus
protocols implemented in the design will work correctly with
other devices using the same protocol. For a design such as
the bus interface ASIC that implements an industry-standard
protocol, special consideration was given to ensure that the
design would be compatible with a spectrum of outside
designs.

Phase-2. This verification phase begins with the receipt of
the fabricated parts. The effort during this phase is
primarily focused on testing the physical components, with
simulation techniques restricted to the supporting role of
duplicating and better understanding phenomena seen on the
bench. The goals of phase-2 verification can be summarized as
compliance, performance, and compatibility. Therefore, part
of phase-2 is spent proving that the physical device behaves
on the bench the same as it did in simulation. The heart of
phase-2, however, is that the design is finally tested for
compatibility with the actual devices that it will be
connected to in a production system.

Verification Challenges

From the point of view of a verification engineer, there are
benefits and difficulties in verifying the implementation of
an industry-standard bus as compared to a proprietary bus.
One benefit is that since PCI is an industry standard, there
are plenty of off-the-shelf simulation and verification tools
available. The use of these tools greatly reduces the
engineering effort required for verification, but at the cost
of a loss of control over the debugging and feature set of
the tools.

The major verification challenge (particularly in phase-1)
was proving compliance with the PCI standard. When verifying
compliance with a proprietary standard there are typically
only a few chips that have to be compatible with one another.
The design teams involved can resolve any ambiguity in the
bus specification. This activity tends to involve only a
small and well-defined set of individuals. In contrast, when
verifying compliance with an open standard there is usually
no canonical source that can provide the correct
interpretation of the specification. Therefore, it is
impossible to know ahead of time where devices will differ in
their implementation of the specification. This made it
somewhat difficult for us to determine the specific tests
required to ensure compliance with the PCI standard. In the
end, it matters not only how faithfully the specification is
followed, but also whether or not the design is compatible
with whatever interpretation becomes dominant.

The most significant challenge in phase-2 testing came in
getting the strategy to become a reality. The strategy
depended heavily on real cards with real drivers to
demonstrate PCI compliance. However, the HP systems with PCI
slots were shipped before any PCI cards with drivers were
supported on HP workstations. Creative solutions were found
to develop a core set of drivers to complete the testing.
However, this approach contributed to having to debug
problems closer to shipment than would have been optimal.
Similarly, 3.3-volt slots were to be supported at first
shipment. The general unavailability of 3.3-volt or universal
(supporting 5 volts and 3.3 volts) cards hampered this
testing. These are examples of the potential dangers of
"preenabling" systems with new hardware capability before
software and cards to use the capability are ready.

An interesting compliance issue was uncovered late in
phase-2. One characteristic of the PA 8000 C-class system is
that when the system is heavily loaded the bus interface ASIC
can respond to PCI requests with either long read latencies
(over 1 µs before acknowledging the transaction) or many
(over 50) sequential PCI retry cycles. Both behaviors are
legal with regard to the PCI 2.0 bus specification, and both
of them are appropriate given the circumstances. However,
neither of these behaviors is exhibited by Intel's PCI
chipsets, which are the dominant implementation of the PCI
bus in the PC industry. Several PCI cards worked fine in a
PC, but failed in a busy C-class system. The PCI card vendors
had no intention of designing cards that were not PCI
compliant, but since they only tested their cards in
Intel-based systems, they never found the problem.
Fortunately, the card vendors agreed to fix this issue on
each of their PCI cards. If there is a dominant
implementation of an industry standard, then deviating from
that implementation adds risk.

Firmware 

Firmware is the low-level software that acts as the interface
between the operating system and the hardware. Firmware is
typically executed from nonvolatile memory at startup by the
workstation. We added the following extensions to the system
firmware to support PCI:

■ A bus walk to identify and map all devices on the PCI bus

■ A reverse bus walk to configure PCI devices

■ Routines to provide boot capability through specified PCI
cards.

The firmware bus walk identifies all PCI devices connected to
the PCI bus and records memory requirements in a resource
request map. When necessary, the firmware bus walk will
traverse PCI-to-PCI bridges.* If a PCI device has Built-in
Self Test (BIST), the BIST is run, and if it fails, the PCI
device is disabled and taken out of the resource request map.
As the bus walk unwinds, it initializes bridges and allocates
resources for all of the downstream PCI devices.
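The bus walk described above is a depth-first traversal, and a small
model makes its shape easier to see. The sketch below is not the
workstation firmware: PciDevice, bus_walk, the bump-style address
allocation, and the base address are all assumptions made for
illustration, and alignment rules are ignored. For brevity, leaf
devices are allocated as they are visited, while each bridge window is
set when the walk unwinds from that bridge.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class PciDevice:
    """Hypothetical model of one device found on a PCI bus."""
    name: str
    mem_bytes: int = 0                           # memory the device requests
    bist: Optional[Callable[[], bool]] = None    # returns True when BIST passes
    downstream: List["PciDevice"] = field(default_factory=list)  # non-empty for a bridge
    enabled: bool = True


def bus_walk(bus: List[PciDevice],
             request_map: Dict[str, int],
             allocations: Dict[str, Tuple[int, int]],
             base: int) -> int:
    """Walk one bus depth first and return the next free address."""
    for dev in bus:
        request_map[dev.name] = dev.mem_bytes
        if dev.bist is not None and not dev.bist():
            # Failed self test: disable the device and drop its request.
            dev.enabled = False
            del request_map[dev.name]
            continue
        if dev.downstream:
            # PCI-to-PCI bridge: descend, then size its window on the way back.
            window_start = base
            base = bus_walk(dev.downstream, request_map, allocations, base)
            allocations[dev.name] = (window_start, base - window_start)
        else:
            allocations[dev.name] = (base, dev.mem_bytes)
            base += dev.mem_bytes
    return base


if __name__ == "__main__":
    lan = PciDevice("lan", 0x2000, bist=lambda: True)
    bad = PciDevice("bad", 0x4000, bist=lambda: False)
    bridge = PciDevice("bridge", downstream=[lan, bad])
    scsi = PciDevice("scsi", 0x1000)
    requests: Dict[str, int] = {}
    allocs: Dict[str, Tuple[int, int]] = {}
    bus_walk([scsi, bridge], requests, allocs, base=0x1000_0000)
    for name, (start, size) in allocs.items():
        print(f"{name}: base={start:#x} size={size:#x}")
```

In this model the device whose BIST fails never receives an address,
and the bridge window covers only the devices behind it that passed.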

Firmware also supports booting the HP-UX operating system
from two built-in PCI devices. Specifically, firmware can
load the HP-UX operating system from either a disk attached
to a built-in PCI SCSI chip or from a file server attached to
a built-in PCI 100BT LAN chip.

* A PCI-to-PCI bridge connects two PCI buses, forwarding
transactions from one to the other.

Firmware Challenges

The first challenge in firmware was the result of another
ambiguity in the PCI specification. The specification does
not define how soon devices on the PCI bus must be ready to
receive their first transaction after the PCI bus exits from
reset. Several PCI cards failed when they were accessed
shortly after PCI reset went away. These cards need to
download code from an attached nonvolatile memory before they
will work correctly. The cards begin the download after PCI
reset goes away, and it can take hundreds of milliseconds to
complete the download. Intel platforms delay one second after
reset before using the PCI bus. This informal compliance
requirement meant that firmware needed to add a routine to
delay the first access after the PCI bus exits reset.

Interfacing with other ASICs implementing varying levels of
the PCI specification creates additional challenges.
Compliance with PCI 2.0 (PCI-to-PCI) bridges resulted in two
issues for firmware. First, the bridges added latency to
processor I/O reads. This extra latency stressed a busy
system and caused some processor I/O reads to time out in the
processor and bring down the system. The firmware was changed
so that it would reprogram the processor timeout value to
allow for this extra delay. The second issue occurs when PCI
2.0 bridges are stacked two or more layers deep. It is
possible to configure the bridges such that the right
combination of processor I/O reads and DMA reads will cause
the bridges to retry each other's transactions and cause a
deadlock or starve one of the two reads. Our system firmware
fixes this problem by supporting no more than two layers of
PCI-to-PCI bridges and configuring the upstream bridge with
different retry parameters than the downstream bridge.

Operating System Support 

The HP-UX operating system contains routines provided for
PCI-based kernel drivers called PCI services. The first HP-UX
release that provides PCI support is the 10.20 release. An
infrastructure exists in the HP-UX operating system for
kernel-level drivers, but the PCI bus introduced several new
requirements. The four main areas of direct impact include
context dependent I/O, driver attachment, interrupt service
routines (ISR), and endian issues. Each area requires special
routines in the kernel's PCI services.



Context Dependent I/O 

In the HP-UX operating system, a centralized I/O services
context dependent I/O (CDIO) module supplies support for
drivers that conform to its model and consume its services.
Workstations such as the C-class and B-class machines use the
workstation I/O services CDIO (WSIO CDIO) for this
abstraction layer. The WSIO CDIO provides general I/O
services to bus-specific CDIOs such as EISA and PCI. Drivers
that are written for the WSIO environment are referred to as
WSIO drivers. The services provided by WSIO CDIO include
system mapping, cache coherency management, and interrupt
service linkage. In cases where WSIO CDIO does need to
interface to the I/O bus, WSIO CDIO translates the call to
the appropriate bus CDIO. For the PCI bus, WSIO CDIO relies
on services in PCI CDIO to carry out bus-specific code.

Ideally, all PCI CDIO services should be accessed only
through WSIO CDIO services. However, there are a number of
PCI-specific calls that cannot be hidden with a generic WSIO
CDIO interface. These functions include PCI register
operations and PCI bus tuning operations.

Driver Attachment 

The PCI CDIO is also responsible for attaching drivers to PCI
devices. The PCI CDIO completes a PCI bus walk, identifying
attached cards that had been set up by firmware. The PCI CDIO
initializes data structures, such as the interface select
code (ISC) structure, and maps the card memory base register.
Next, the PCI CDIO calls the list of PCI drivers that have
linked themselves to the PCI attach chain.

The PCI driver is called with two parameters: a pointer to an
ISC structure (which contains mapping information and is used
in most subsequent PCI services calls) and an integer
containing the PCI device's vendor and device IDs. If the
vendor and device IDs match the driver's interface, the
driver attach routine can do one more check to verify its
ownership of the device by reading the PCI subsystem vendor
ID and subsystem ID registers in the configuration space. If
the driver does own this PCI device, it typically initializes
data structures, optionally links in an interrupt service
routine, initializes and claims the interface, and then calls
the next driver in the PCI attach chain.

Interrupt Service Routines

The PCI bus uses level-sensitive, shared interrupts. PCI
drivers that use interrupts use a WSIO routine to register
their interrupt service routine with the PCI CDIO. When a PCI
interface card asserts an interrupt, the operating system
calls the PCI CDIO to do the initial handling. The PCI CDIO
determines which PCI interrupt line is asserted and then
calls each driver associated with that interrupt line.

The PCI CDIO loops, calling drivers for an interrupt line
until the interrupt line is deasserted. When all interrupt
lines are deasserted, the PCI CDIO reenables interrupts and
returns control to the operating system. To prevent deadlock,
the PCI CDIO has a finite (although large) number of times it
can loop through an interrupt level before it will give up
and leave the interrupt line disabled. Once disabled, the
only way to reenable the interrupt is to reboot the system.
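The bounded service loop described above can be sketched as follows.
This is an illustrative model only and not HP-UX kernel code:
InterruptLine, service_line, and the MAX_PASSES value are assumptions
made for the example, and the text gives no actual loop limit.

```python
from typing import Callable, List

MAX_PASSES = 1000  # hypothetical cap; the article only says "finite (although large)"


class InterruptLine:
    """Model of one level-sensitive interrupt line shared by several drivers."""

    def __init__(self, is_asserted: Callable[[], bool]) -> None:
        self.is_asserted = is_asserted                 # samples the line level
        self.handlers: List[Callable[[], None]] = []   # registered ISRs
        self.enabled = True


def service_line(line: InterruptLine, max_passes: int = MAX_PASSES) -> bool:
    """Call every ISR on the line until the line deasserts.

    Returns True when the line deasserts, or False when the pass limit
    is reached and the line is left disabled.
    """
    passes = 0
    while line.is_asserted():
        if passes >= max_passes:
            line.enabled = False   # give up; stays disabled in this model
            return False
        for handler in line.handlers:
            handler()              # each driver clears its own device, if pending
        passes += 1
    return True


if __name__ == "__main__":
    pending = {"lan": True, "scsi": True}
    shared = InterruptLine(lambda: any(pending.values()))
    shared.handlers.append(lambda: pending.update(lan=False))
    shared.handlers.append(lambda: pending.update(scsi=False))
    print("shared line serviced:", service_line(shared))

    stuck = InterruptLine(lambda: True)   # a device that never deasserts
    stuck.handlers.append(lambda: None)
    print("stuck line serviced:", service_line(stuck, max_passes=5))
```

Because the line is level sensitive and shared, every registered
handler must be called on each pass; the cap is what keeps a
misbehaving card from hanging the system in the loop.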

PCI Endian Issues

PCI drivers need to be cognizant of endian issues.* The PCI
bus is inherently little endian while the rest of the
workstation hardware is big endian. This is only an issue
with card register access when the register is accessed in
quantities other than a byte. Typically there are no endian
issues associated with data payload since data payload is
usually byte-oriented. For example, network data tends to be
a stream of byte data. The PCI CDIO provides one method for
handling register endian issues. Another method lies in the
capability of some PCI interface chips to configure their
registers to be big or little endian.
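A short worked example shows why only multibyte register accesses are
affected. It uses Python's standard struct module; the register
contents and the helper names are made up for illustration and are not
the PCI CDIO routines mentioned above.

```python
import struct


def read_reg32_le(raw: bytes) -> int:
    """Interpret four bytes as a little-endian 32-bit register value."""
    return struct.unpack("<I", raw)[0]


def swap32(value: int) -> int:
    """Reverse the byte order of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "big"), "little")


if __name__ == "__main__":
    # A little-endian card register holding 0x12345678, lowest address first.
    raw = bytes([0x78, 0x56, 0x34, 0x12])

    naive = struct.unpack(">I", raw)[0]   # what a big-endian load would see
    print(f"naive big-endian read: {naive:#010x}")
    print(f"after byte swap:       {swap32(naive):#010x}")
    print(f"little-endian read:    {read_reg32_le(raw):#010x}")

    # Single-byte accesses are unaffected by byte order.
    print(f"byte at offset 0:      {raw[0]:#04x}")
```

The swap is its own inverse, so the same helper serves for register
reads and register writes.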

* Little endian and big endian are conventions that define
how byte addresses are assigned to data that is two or more
bytes long. The little endian convention places bytes with
lower significance at lower byte addresses. (The word is
stored "little-end-first.") The big endian convention places
bytes with greater significance at lower byte addresses. (The
word is stored "big-end-first.")

Operating System Support Challenges

We ran into a problem when third-party card developers were
porting their drivers to the HP-UX operating system. Their
drivers only looked at device and vendor identifiers and
claimed the built-in LAN inappropriately. Many PCI interface
cards use an industry-standard bus interface chip as a front
end and therefore share the same device and vendor IDs. For
example, several vendors use the Digital 2114X family of
PCI-to-10/100 Mbits/s Ethernet LAN controllers, with each
vendor customizing other parts of the network interface card
with perhaps different physical layer entities. It is
possible that a workstation could be configured with multiple
LAN interfaces having the same vendor and device ID with
different subsystem IDs controlled by separate drivers. A
final driver attachment step was added to verify the driver's
ownership of the device. This consisted of reading the PCI
subsystem vendor ID and subsystem ID registers in the
configuration space.
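The ownership check can be illustrated with a toy attach chain. This
sketch does not use the HP-UX PCI services interface: PciFunction,
Driver, run_attach_chain, and every ID value are placeholders invented
for the example. The chain is modeled as a simple loop, whereas in the
scheme described in this article each driver calls the next driver in
the chain itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PciFunction:
    """Identification registers of one PCI device (placeholder values only)."""
    vendor_id: int
    device_id: int
    subsys_vendor_id: int
    subsys_id: int
    owner: Optional[str] = None


@dataclass
class Driver:
    name: str
    vendor_id: int
    device_id: int
    subsys_vendor_id: int
    subsys_id: int

    def attach(self, dev: PciFunction) -> bool:
        """Claim the device only if all four IDs match and it is unclaimed."""
        if dev.owner is not None:
            return False
        if (dev.vendor_id, dev.device_id) != (self.vendor_id, self.device_id):
            return False
        # Final step: the front-end chip matches, so check whose board this is.
        if (dev.subsys_vendor_id, dev.subsys_id) != (self.subsys_vendor_id,
                                                     self.subsys_id):
            return False
        dev.owner = self.name
        return True


def run_attach_chain(chain: List[Driver], dev: PciFunction) -> Optional[str]:
    """Offer the device to every driver in the chain; return the owner's name."""
    for driver in chain:
        driver.attach(dev)
    return dev.owner


if __name__ == "__main__":
    # Both interfaces use the same LAN chip, so vendor and device IDs are equal.
    builtin_lan = PciFunction(0x1234, 0x0001, 0xAAAA, 0x0001)
    addon_lan = PciFunction(0x1234, 0x0001, 0xBBBB, 0x0002)
    chain = [
        Driver("third_party_lan", 0x1234, 0x0001, 0xBBBB, 0x0002),
        Driver("builtin_lan", 0x1234, 0x0001, 0xAAAA, 0x0001),
    ]
    print(run_attach_chain(chain, builtin_lan))
    print(run_attach_chain(chain, addon_lan))
```

Without the subsystem comparison, the first driver in the chain would
claim both interfaces, which is the failure described above.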

The HP-UX operating system does not have the ability to
allocate contiguous physical pages of memory. Several PCI
cards (for example, SCSI and Fibre Channel) require
contiguous physical pages of memory for bus master task
lists. The C-class implementation, which allows virtual DMA
through TLB (translation lookaside buffer) entries, is
capable of supplying 32K bytes of contiguous memory space. In
the case of the B-class workstation, which does not support
virtual DMA, the team had to develop a workaround that
consisted of preallocating contiguous pages of memory to
enable this class of devices.



Conclusion 



PCI and Interoperability. We set out to integrate PCI into
the HP workstations. The goal was to provide our systems with
access to a wide variety of industry-standard I/O cards and
functionality. The delivery of this capability required the
creation and verification of a bus interface ASIC and
development of the appropriate software support in firmware
and in the HP-UX operating system.

Challenges of Interfacing with Industry Standards. There are
many advantages to interfacing with an industry standard, but
it also comes with many challenges. In defining and
implementing an I/O bus architecture, performance is a
primary concern. Interfacing proprietary and
industry-standard buses and achieving acceptable performance
is difficult. Usually the two buses are designed with
different goals for different systems, and determining the
correct optimizations requires a great deal of testing and
redesign.

Maintaining compliance with an industry standard is another
major challenge. It is often like shooting at a moving
target. If another vendor ships large enough volumes of a
nonstandard feature, then that feature becomes a de facto
part of the standard. It is also very difficult to prove that
the specification is met. In the end, the best verification
techniques for us involved simply testing the bus interface
ASIC against as many devices as possible to find where the
interface broke down or performed poorly.

Finally, it is difficult to drive development and
verification unless the functionality is critical to the
product being shipped. The issues found late in the
development cycle for the bus interface ASIC could have been
found earlier if the system had required specific PCI I/O
functionality for initial shipments. The strategy of
preenabling the system to be PCI compatible before a large
number of devices became available made it difficult to
achieve the appropriate level of testing before the systems
were shipped.

Successes. The integration of PCI into the HP workstations
through design and verification of the bus interface ASIC and
the development of the necessary software components has been
quite successful. The goals of the PCI integration effort
were to provide fully compatible, high-performance PCI
capability in a cost-effective and timely manner. The design
meets or exceeds all of these goals. The bandwidth available
to PCI cards is within 98 percent of the bandwidth available
to native GSC cards. The solution was ready in time to be
shipped in the first PCI-enabled HP workstations: B132, B160,
C160, and C180.

The bus bridge ASIC and associated software have since been
enhanced for two new uses in the second generation of PCI on
HP workstations. The first enhancement provides support for
the GSC-to-PCI adapter to enable specific PCI functionality
on HP server GSC I/O cards. The second is a version of the
bus interface supporting GSC-2x (higher bandwidth GSC) and
64-bit PCI for increased bandwidth to PCI graphics devices on
HP C200 and C240 workstations.

Computers have had a profound effect on how companies conduct
business. They are used to run enterprise business software
and to automate factory-floor production. While this has been
a great benefit, the level of coordination between computers
running unrelated application software is usually limited.
This is because such data transfers are difficult to
implement, often requiring manual intervention or customized
software. Until recently, off-the-shelf data transfer
solutions were not available.

HP Enterprise Link is a middleware software product that
increases the effectiveness of companies involved in
manufacturing and production. It allows business management
software running at the enterprise level, such as SAP's R/3
product, to exchange information (via electronic transfer)
with software applications running on the factory floor. It
also allows software applications running on the factory
floor to exchange information with each other.

HP Enterprise Link is available for HP 9000 computers running
the HP-UX* operating system and PC platforms running
Microsoft's Windows® NT operating system.

This article will discuss the evolution of the link between
business software systems and factory automation systems, and
the functionality provided in HP Enterprise Link to enable
these two environments to communicate.

Background 

Initially only large corporations could afford computers.
They ran batch-oriented enterprise business software to do
payroll, scheduling, and inventory. As the cost of computing
dropped, smaller companies began using computers to run
business software, and companies involved in manufacturing
began using them to automate factory-floor production.

Although factory-floor automation led to improved efficiency
and productivity, it was usually conducted on a piecemeal
basis. Different portions of an assembly line were often
automated at different times and often with different
computer equipment, depending on the capabilities of computer
equipment available at the time of purchase. As a result,
today's factory-floor computers are usually isolated hosts,
dedicated to automating selected steps in production. While
various factory-floor functions are automated, they do not
necessarily communicate with one another. They are isolated
in "islands of automation." To make matters worse, the
development of programmable logic controllers (PLCs) and
other dedicated "smart" factory-floor devices has increased
the number of isolated computers, making the goal of
integrated factory-floor computation that much harder to
achieve.

While production software was generally used for smaller,
more isolated problems, business software was used to solve
larger company-wide problems. Furthermore, while production
software was more real-time oriented, business software was
more transaction and batch oriented.

These differing needs caused business systems to evolve with
little concern for the kind of computing found on the factory
floor. Similarly, production systems evolved with little
concern for the kind of computing found at the enterprise
level. As a result, many enterprise-level business systems
and factory-floor computers are not able to intercommunicate.
Figure 1 shows an example of the components that make up a
typical enterprise and factory-floor environment.

The net effect is that today companies find it difficult and
expensive to integrate factory-floor systems with each other
and with business software running at the enterprise level.
This is unfortunate because the dynamic nature of the
marketplace and the desire to reduce inventory levels have
made the need for such integration very high.

Marketplace Dynamics

Over the last decade, the marketplace has become increasingly
dynamic, forcing businesses to adapt ever more quickly to
changing market conditions. Computer systems now experience a
continuous stream of modifications and upgrades. Generally,
this has forced business systems to adopt more real-time
behaviors and production systems to become more flexible. It
has also increased the frequency and volume of data
transferred between business and production systems and
between the many production systems.

Figure 1

Computing at the enterprise and factory-floor levels.

There has always been a requirement to transfer information
between computers in an organization, both horizontally
between computers at the same functional level, and
vertically between computers at different functional levels.
In the past, manual data entry was an often-used approach.
Hard-copy printouts generated by business management systems
would be provided to operators who manually entered the
information into one or more production systems. Although
this was an acceptable approach in the past, such an approach
is not sufficiently responsive in today's dynamic business
environment. As a result, the need for electronic data
transfer capability between the various business management
and production level computers is now very high.

Electronic Data Transfers 

Integrated business software with built-in support for data
transfers between components is sometimes used at the
business management level. While this minimizes the effort
required to exchange data between the various components of
enterprise business systems, it is often inflexible and
restrictive with regard to what can be exchanged and when
exchanges occur.

Organizations that use a variety of business software
packages, rather than a single integrated package, have
typically developed custom software for electronic data
transfers between packages. Unfortunately, marketplace
dynamics require custom software to be constantly reworked.
This ongoing rework forces companies to either maintain
in-house programming expertise or repeatedly hire software
consultants to implement needed changes. As a result, custom
data transfer software is not only expensive to develop but
also costly to maintain, especially if changes must be
implemented on short notice.

On the factory floor, software programmers have been employed
to develop custom data transfer solutions that allow the
different islands of automation to communicate. As previously
noted, this approach is difficult to implement and expensive
to maintain. In addition, this approach is often inflexible
since the resulting software is usually developed assuming
that the configuration of factory-floor systems is largely
static.

When new equipment and application software are to be
integrated into the overall system, software programmers
don't just prepare additional custom software. They must also
modify the existing custom software for all applications
involved. For this reason, custom software is often avoided,
and electronic data transfer capability is frequently
confined to transfers between equipment and software supplied
by the same manufacturer.

Differences in hardware (and associated operating systems)
and differences in the software applications themselves cause
numerous application integration problems. Here are a few
examples:

■ Data from applications running on computers that have
proprietary hardware architectures and operating systems is
often not usable on other systems.

■ Different applications use different data types according
to their specific needs.

■ Incompatible data structures often result because of the
different groupings of data elements by software
applications. For example, an element with a common logical
definition in two applications may still be stored with two
different physical representations.

■ Applications written in different languages sometimes
interpret their data values differently. For example, C and
COBOL interpret binary numeric data values differently.

What is needed, therefore, is an off-the-shelf product that
is specifically designed to interconnect applications that
were not originally designed to work together. That product
must automatically, quickly, efficiently, and
cost-effectively integrate applications having incompatible
programming interfaces at the same or different functional
levels of an organization. HP Enterprise Link is such a
product.

HP Enterprise Link is an interactive point-and-click software
product that is used to connect software applications (such
as business planning and execution systems) to control
supervisory systems found on the factory floor. HP Enterprise
Link greatly reduces the cost and effort required to
interconnect such systems while eliminating the need for
custom software.

The Data Transfer Problem

The problem of transferring data from one software
application to another is conceptually simple: just fetch the
data from one system and place it in another. In practice the
problem is more complex. The following issues arise when
trying to implement electronic data transfer solutions:

■ There must be a way to obtain data from the software
application serving as the data source. Such access, for
example, might be provided by a library of callable C
functions.

■ There must be a way to forward data to the software
application serving as the data destination. For example,
data might be placed in messages that are sent to the
destination application.

■ There must be a specification of exactly what to fetch from
the source application and exactly where to place it in the
destination application.

■ The data being transferred must be translated from the
format provided by the data source to the format required by
the data destination.

■ There must be a specification of the circumstances under
which data should be transferred and a way to detect when
these circumstances occur.

All of these issues are addressed in HP Enterprise Link.
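The issues listed above can be pictured as one small function with a
parameter for each concern. This is a generic sketch and not the HP
Enterprise Link interface: transfer_once, the dictionary-based source
and destination, and the tank-level example are all assumptions made
for illustration.

```python
from typing import Callable, Dict


def transfer_once(source: Dict[str, object],
                  destination: Dict[str, object],
                  source_address: str,
                  destination_address: str,
                  translate: Callable[[object], object],
                  condition: Callable[[], bool]) -> bool:
    """Move one unit of data if the transfer condition holds.

    source / destination : stand-ins for the two applications
    *_address            : what to fetch and where to place it
    translate            : converts the source format to the destination format
    condition            : states when the transfer should happen
    """
    if not condition():
        return False
    value = source[source_address]                        # obtain the data
    destination[destination_address] = translate(value)   # translate, then forward
    return True


if __name__ == "__main__":
    # Hypothetical case: a controller reports text centimetres and the
    # business system wants numeric metres.
    controller = {"tank_level_cm": "125"}
    business = {}
    done = transfer_once(controller, business,
                         "tank_level_cm", "level_m",
                         translate=lambda text: int(text) / 100,
                         condition=lambda: "tank_level_cm" in controller)
    print(done, business)
```

Keeping the five concerns as separate parameters is what allows them
to be configured rather than hand-coded for each pair of applications.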

HP Enterprise Link 

The HP Enterprise Link product consists of the three
components shown in Figure 2:

■ An interactive configuration tool. This interactive
window-based application allows users to direct the movement
of data between two software applications.

■ A data server. This noninteractive process runs in the
background. It moves data in accordance with the directives
that the user specified with the configuration tool.

■ Configuration files. This is the set of mappings and
trigger criteria created by users. The data is stored in
configuration files. These files are created and modified by
the configuration tool and read by the data server.

Linking Components

The HP Enterprise Link components listed above have the
common goal of enabling users to create middleware that maps
components with different interfaces together for data
transfer.

In HP Enterprise Link, the combination of a single source
address and a single destination address is called a mapping.
A unit of data at the specified source address is said to be
mapped to the specified destination address. In other words,
it can be read from the specified source address and written
to the specified destination address.

Although a mapping deals with the transfer of a single unit
of data, real-world situations usually require the transfer
of many units of data simultaneously. Therefore, HP
Enterprise Link collects mappings into groups called methods.
A method contains one or more mappings.

Mappings describe what to transfer and where to transfer it,
whereas triggers describe exactly when to do the transfer.
Data is actually transferred whenever a specified trigger
condition is satisfied. This condition is called the trigger
criterion. There are many possible trigger criteria such as:

■ Whenever a unit of data at a specified source address
changes value

■ Whenever a unit of data at a specified source address is
set to a specified value

■ Whenever the source data becomes available, such as
arriving in a message

■ At a preset time of the day or a preset day of the week.

HP Enterprise Link considers trigger criteria to be part of
the definition of a method. All the mappings for a single
method share the same trigger criteria. Whenever the trigger
criteria are met, HP Enterprise Link transfers, in unison,
all the data specified by the method's mappings.
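The relationship between mappings, methods, and trigger criteria can
be sketched as a small data model. The class and function names below
(Mapping, Method, value_equals) are illustrative assumptions and do
not reflect the product's configuration file format or programming
interface; only the "set to a specified value" trigger is modeled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Mapping:
    """One source address paired with one destination address."""
    source: str
    destination: str


@dataclass
class Method:
    """A group of mappings that share a single trigger criterion."""
    name: str
    mappings: List[Mapping]
    trigger: Callable[[Dict[str, object]], bool]

    def run_if_triggered(self, src: Dict[str, object],
                         dst: Dict[str, object]) -> bool:
        if not self.trigger(src):
            return False
        # Read every source value first, then write them all together.
        values = [(m.destination, src[m.source]) for m in self.mappings]
        for address, value in values:
            dst[address] = value
        return True


def value_equals(address: str, expected: object) -> Callable[[Dict[str, object]], bool]:
    """Trigger criterion: the data at `address` is set to `expected`."""
    return lambda src: src.get(address) == expected


if __name__ == "__main__":
    enterprise = {"recipe_ready": 0, "recipe_id": "R-17", "batch_kg": 500}
    factory: Dict[str, object] = {}
    download = Method(
        name="download_recipe",
        mappings=[Mapping("recipe_id", "plc.recipe"),
                  Mapping("batch_kg", "plc.batch_size")],
        trigger=value_equals("recipe_ready", 1),
    )
    print(download.run_if_triggered(enterprise, factory), factory)
    enterprise["recipe_ready"] = 1
    print(download.run_if_triggered(enterprise, factory), factory)
```

A second method with its own trigger, for example a timed one, could
move consumption data in the opposite direction without touching the
first.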

Multiple methods can simultaneously exist in HP Enterprise
Link. For example, a user can create one method to
transfer a particular production recipe from a business
enterprise system down to a factory-floor control system.
Conversely, raw-material consumption information for
the recipe currently in production could be transferred
periodically from the factory-floor control system up to
the business enterprise system, using a second method.

The Configuration Tool

The HP Enterprise Link configuration tool provides users
with a view of each software application's namespace,
and the tool graphically depicts what data to transfer and
under what circumstances such transfers should occur
(Figure 3).

The HP Enterprise Link configuration tool is composed
of communication objects and a graphical user interface
(GUI). Communication objects are used to obtain namespace
data that is unique to each application and to provide
application-specific windows. The configuration tool
provides the user with an easy-to-use point-and-click style
GUI.



Figure 3

All dependencies on particular software applications are
encapsulated in communication objects. The configuration
tool's communication objects provide the following
functionality:

■ They fetch namespace information from communicating
software applications for presentation to the user.

■ They provide routines to create and manage application-
dependent control panel widgets, such as those used
to specify triggers unique to a particular software
application.

■ They provide routines to tell the GUI exactly what
functionality is supported by a communication object. For
example, can the application software serve only as a
data source (supply data values), or can it serve as both
a data source and a data destination (both supply and
use data values)?

There are three important windows in the configuration
tool's GUI: the Edit Method window, the Edit Mapping
window, and the Trigger Configuration window.

Edit Mapping. The Edit Mapping window is used to create
new mappings (Figure 4). The namespaces of both the
source software application and the destination software
application are shown. They are graphically displayed
as tree diagrams. This makes it easy for users to specify
which data to move where. They don't have to remember
the names of data sources or data destinations. Instead
they just choose from the displayed list of possibilities.
The side-by-side display of application namespaces makes
it much easier to integrate the applications.

Tree diagrams are used because they make large namespaces
manageable. A linear namespace display was
rejected early in the design of HP Enterprise Link because
a flat list representation would only be manageable with
software applications having a small namespace. Another
advantage of tree diagrams is that most users are already
familiar with them from file selector windows found in
many software applications.

To create a new mapping the user selects an item from
the Mapping Source tree diagram and an item from the
Mapping Destination tree diagram, and then clicks the Add
Mapping button. A new mapping is added to the mapping
table displayed on the Edit Method window (Figure 5).

Figure 4

The Edit Mapping window.




Multiple static mappings can be created in a single step
using branch assignments. This requires that the last
component of the source and destination addresses be
identical (so that appropriate mappings can be automatically
created). Mappings can also be automatically created at
the time methods are triggered. This is called dynamic
mapping and requires the user to specify algorithms that
can select source addresses and transform these addresses
to valid destination addresses.
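A branch assignment can be pictured as pairing the leaves of two
namespace branches by their last address component. The following
is a minimal sketch, assuming slash-separated addresses; the
function name, the separator, and the example addresses are
illustrative assumptions, not part of the product.

```python
from typing import List, Tuple


def branch_assign(source_branch: str, source_leaves: List[str],
                  dest_branch: str, dest_leaves: List[str],
                  sep: str = "/") -> List[Tuple[str, str]]:
    """Create one mapping per leaf name present under both branches.

    Leaves are matched on the last address component only, and the
    result keeps the order of the source leaves.
    """
    dest_names = set(dest_leaves)
    pairs = []
    for leaf in source_leaves:
        if leaf in dest_names:
            pairs.append((f"{source_branch}{sep}{leaf}",
                          f"{dest_branch}{sep}{leaf}"))
    return pairs
```

Leaves that exist on only one side are simply skipped here; how the
real tool reports such leaves is not stated in the text.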

Edit Method. The Edit Method window (Figure 5) displays
a method's mappings as a two-column table titled Mappings.
Source addresses appear in the left column and
destination addresses appear in the right. The data server
transfers mapped data from source addresses to destination
addresses in the same order as the mappings are
listed in this table. The Mappings table makes mappings
both explicit and intuitive to the user.



This window allows the user to specify in which direction
to transfer data. All of a method's mappings specify data
transfers in one direction, from one software application
to another. The Edit Method window also allows the user
to specify how to respond to errors that occur during data
transfers. This will be described later in more detail.

Trigger Configuration. The Trigger Configuration window
is used to define trigger criteria (Figure 6). This window
displays all possible triggers to the user, as well as the
currently configured trigger criteria. The Trigger Configuration
window is designed to make setting up trigger criteria
explicit and intuitive for the user.

Figure 5

The Edit Method window.

The Trigger Configuration window is split into three groups:
time triggers, triggers unique to the source application,
and triggers unique to the destination application. Time
triggers allow the user to specify that data mapping start
at some specified time and repeat at a specified time
interval, but be synchronized to a specified hour/minute/
second of the day/hour/minute.
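One way to read the time-trigger description is that firings occur
at a start time plus whole multiples of the repeat interval, so the
schedule stays synchronized to the start time rather than drifting.
The sketch below illustrates that reading; the function name and
the "strictly after now" rule are assumptions made for the example.

```python
from datetime import datetime, timedelta


def next_fire(now: datetime, start: datetime,
              interval: timedelta) -> datetime:
    """Return the first scheduled firing time strictly after `now`.

    Firing times are start, start + interval, start + 2*interval, ...
    so the schedule stays aligned to `start`.
    """
    if now < start:
        return start
    # Number of whole intervals elapsed since start, plus one.
    periods = (now - start) // interval + 1
    return start + periods * interval
```

For example, a method configured to start at 08:00 and repeat every
15 minutes would next fire at 08:15 if the current time is 08:07.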

Triggers unique to the source application, such as the
RTAP (real-time application platform) triggers shown in
Figure 6, allow data to be mapped when something
interesting happens in the source application. For the RTAP
triggers in Figure 6, interesting events include a database
value change or the occurrence of an RTAP database
alarm. Data can also be mapped when something
interesting happens in the destination application.

Thus, triggers allow data transfers to be pushed from the
source application, pulled from the destination application,
or scheduled by time.

Summary. Using the windows just described, users can
create methods with the configuration tool. These methods
specify one or more mappings and associated trigger
criteria. This information is saved in one or more
configuration files. The data server then reads these
configuration files to implement the user's methods.

The Data Server

The HP Enterprise Link data server is composed of
communication objects, a trigger manager, and a mapping
engine (Figure 7). Communication objects deal with the
problems of generating triggers and getting data into and
out of software applications. The trigger manager deals
with dispersing Trigger Configuration data, coordinating
trigger events, and notifying the mapping engine of trigger
events. The mapping engine deals with the problems of
reading configuration files, responding to triggers, mapping
source addresses to destination addresses, and transforming
the data as it is being mapped.

Figure 6

The Trigger Configuration window.

All software-application dependencies are encapsulated
in communication objects. Communication objects serve
as translators between external software applications and
the data server's mapping engine: they translate the
software application's native application program
interface (API) to the interface used by the mapping engine.

Figure 7

The components of the HP Enterprise Link data server.

The interface between a communication object and the
mapping engine is standardized, with all communication
objects using the same interface. For data that is being
transferred, the interface consists solely of address-value
pairs, where the address is from the application
software's namespace, and the value is encoded in a neutral
form. Thus a communication object only needs to be
aware of its own namespace and how to convert between
the software application's proprietary data formats and
the neutral HP Enterprise Link data format. For triggers,
the interface consists of well-documented interactions
between the trigger manager and the communication
objects.
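A standardized interface built on address-value pairs might look
like the sketch below. The article does not describe the actual
neutral format or the interface signatures, so everything here is
an assumption for illustration: the class and method names are
hypothetical, and the neutral form is taken to be plain text.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Tuple

# (address, value); the value is in an assumed neutral text form.
AddressValue = Tuple[str, str]


class CommunicationObject(ABC):
    """Hypothetical uniform interface seen by the mapping engine."""

    @abstractmethod
    def read(self, addresses: Iterable[str]) -> List[AddressValue]:
        """Fetch native values and return them in neutral form."""

    @abstractmethod
    def write(self, pairs: Iterable[AddressValue]) -> None:
        """Convert neutral values to native form and store them."""


class FloatTableObject(CommunicationObject):
    """Toy application whose native format is a table of floats."""

    def __init__(self, table: Dict[str, float]) -> None:
        self.table = table

    def read(self, addresses: Iterable[str]) -> List[AddressValue]:
        return [(a, repr(self.table[a])) for a in addresses]

    def write(self, pairs: Iterable[AddressValue]) -> None:
        for address, neutral in pairs:
            self.table[address] = float(neutral)
```

Only `FloatTableObject` knows about floats; code written against
`CommunicationObject` sees nothing but address-value pairs.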

Communication objects are usually distributed. They are
split into two parts that are interconnected by a
communication channel such as a TCP/IP socket. One part of the
object is incorporated into the HP Enterprise Link data
server process, while the other runs on the same machine
as the corresponding software application. When a
communication object is not split into two parts, the object,
the data server, and the software application must run on
the same machine.

Communication objects communicate with their
corresponding software applications through whatever
mechanism is available. For example, this could be through a
serial port, shared memory, shared files, TCP/IP sockets,
or an application program interface (API).

When a communication object transfers data, it translates
data between the format used by the source software
application and the neutral format required by the mapping
engine. For example, for numeric values, a communication
object may have to translate between binary IEEE-754
floating-point format and the mapping engine's neutral
format.



In practice, not all data transfer attempts will be
successful. For example, a particular source address might have
been deleted, or a destination address may no longer
exist. The configuration tool is used to specify what the
mapping engine should do in this situation, and the data
server must detect the condition and deal with it
appropriately. When data transfer attempts fail, the user can
have the data server do any one of the following:

■ Continue mapping data (ignoring the error)

■ Abort all subsequent mappings associated with the
current method

■ Abort all subsequent mappings and all subsequent
methods triggered by the current trigger event (if
multiple methods were simultaneously triggered).

The interface between the communication object and
the mapping engine is designed to support transaction-
oriented data transfers, using commit and rollback. This
functionality comes into play when mapping attempts fail.
It allows the data server to undo (roll back) all data
transfers done in all currently processed mappings associated
with the method's current trigger event.
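The three failure responses and the optional rollback can be
sketched as follows. This is an illustration only: the names
`OnError` and `run_method` are hypothetical, dictionaries stand in
for the source and destination applications, and a missing source
address is the only failure modeled.

```python
from enum import Enum
from typing import Dict, List, Tuple


class OnError(Enum):
    CONTINUE = 1      # ignore the error and keep mapping
    ABORT_METHOD = 2  # skip the rest of this method's mappings
    ABORT_EVENT = 3   # also skip other methods of this trigger event


def run_method(mappings: List[Tuple[str, str]],
               source: Dict[str, object],
               dest: Dict[str, object],
               on_error: OnError,
               rollback: bool = False) -> bool:
    """Apply mappings in order.

    Returns False only when the remaining methods fired by the same
    trigger event should be skipped as well.
    """
    undo = []  # (destination address, existed before?, old value)
    for src, dst in mappings:
        if src in source:
            undo.append((dst, dst in dest, dest.get(dst)))
            dest[dst] = source[src]
            continue
        # The source address is missing: this transfer failed.
        if on_error is OnError.CONTINUE:
            continue
        if rollback:
            # Undo the transfers already made, newest first.
            for address, existed, previous in reversed(undo):
                if existed:
                    dest[address] = previous
                else:
                    del dest[address]
        return on_error is not OnError.ABORT_EVENT
    return True
```

A caller looping over all methods fired by one trigger event would
stop as soon as `run_method` returns False.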

The Running Data Server

When the HP Enterprise Link data server starts up, it reads
the configuration files that the user created with the
configuration tool. It then prepares to deal with the specified
trigger criteria, usually by notifying the appropriate
communication object to detect it. Finally, it enters an
event-driven mode, waiting for the trigger criteria of any
configured method to be satisfied.

When either a source or destination communication
object in the data server detects that a method's trigger
criteria have been satisfied, the object informs the data
server trigger manager that a method has been triggered.
This starts the mapping engine. Alternatively, if the data
server trigger manager detects that a method's time-based
trigger criteria have been satisfied, the mapping engine
starts.

When triggered, the mapping engine requests that the
source communication object provide the current data
values at the method's configured source addresses. The
source communication object obtains these values from
the software application, translates the format of all
fetched data values to a neutral format, and passes the
result to the mapping engine as address-value pairs, with
one such pair for each of the method's defined mappings.

The data server mapping engine looks up the destination
address that corresponds to each source address. This
lookup results in a new list of address-value pairs, with
the address now being the destination address, and the
value unchanged (and still expressed in the mapping
engine's neutral format). To minimize the impact on
performance, this lookup is implemented using a hash table.

The mapping engine sends the new list of address-value
pairs to the destination communication object. The
destination communication object converts the received
values into the format required by the destination software
application, and writes the converted result to the
specified addresses in the destination software application.
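The three run-time steps just described (read and neutralize,
look up destinations, convert and write) can be sketched in a few
lines. Assumptions for this example: dictionaries stand in for the
two applications, both applications hold floats natively, the
neutral form is text, and the Python `dict` plays the role of the
hash table. The function name is hypothetical.

```python
from typing import Dict, List, Tuple


def run_mapping(source_app: Dict[str, float],
                dest_app: Dict[str, float],
                lookup: Dict[str, str]) -> List[Tuple[str, str]]:
    """Move every mapped value from source_app to dest_app."""
    # 1. Source side: fetch values and convert to the neutral form.
    source_pairs = [(addr, str(source_app[addr])) for addr in lookup]

    # 2. Mapping engine: replace each source address with its
    #    destination address via a hash-table lookup; the value is
    #    left unchanged.
    dest_pairs = [(lookup[addr], value) for addr, value in source_pairs]

    # 3. Destination side: convert from the neutral form and write.
    for addr, value in dest_pairs:
        dest_app[addr] = float(value)
    return dest_pairs
```

Because a dictionary lookup takes constant time on average, the
cost of step 2 grows only with the number of mappings in the
method, not with the size of either namespace.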

Communication Objects and Software Applications

There are two fundamental ways for software applications
to provide communication objects access to their data:
the request-reply method and the spontaneous-message
method.

In the request-reply method, the communication object
sends a software application the address of a wanted data
unit in a request and receives its current value in a reply.
With this method the communication object controls the
data transfer. It determines which unit of data to read and
when to read it. Structured Query Language (SQL) and
real-time databases are two examples of software
applications that employ the request-reply method.

In the spontaneous-message method, communication
objects receive data, usually as messages, from the software
application whenever the application chooses to send it.
With this method the software application controls the
data transfer. It determines which data to provide and
when to provide it. SAP's R/3 product is an example of
a software application using the spontaneous-message
method.

The method that a software application employs to provide
external data access determines the trigger criteria that
are possible for that application's communication object.
The request-reply method allows event, value, and time-
based trigger criteria since the communication object
controls the data transfer. The spontaneous-message
method is limited to value-based triggering (essentially
filtering) because the software application providing the
data controls the data transfer.
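The difference in who controls the transfer can be made concrete
with two small helpers. Both are hypothetical illustrations: in
the first the communication object decides when to ask, and in the
second it can only accept or reject what the application sends.

```python
from typing import Callable, Dict, Tuple


def poll_for_change(read_value: Callable[[str], object],
                    address: str,
                    last_seen: object) -> Tuple[bool, object]:
    """Request-reply: request the value and compare with the last one.

    Returns (fired, current_value). The caller decides when to poll,
    so time-based and change-based criteria are both possible.
    """
    current = read_value(address)
    return current != last_seen, current


def accept_message(message: Dict[str, object],
                   address: str,
                   predicate: Callable[[object], bool]) -> bool:
    """Spontaneous-message: filter a message the application sent.

    The only available criterion is a test on values in the message.
    """
    return address in message and predicate(message[address])
```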

Spooling 

The HP Enterprise Link data server's communication
objects must cope with communication failures. This
means that outgoing data must be locally buffered until
a communication object verifies that the application
software, when acting as a destination, has successfully
received it. It also means that incoming data must either be
safely transferred through the mapping engine or locally
buffered when a communication object accepts data from
the source application software.

Spooling is especially important if the source application
is separated from the HP Enterprise Link data server by
a wide area network (WAN). WANs are considerably less
reliable than local area networks, and thus are more likely
to lose data.

In a typical HP Enterprise Link installation the data server
runs on a machine located near or on the factory floor.
Production orders are downloaded from the enterprise
level to HP Enterprise Link as soon as they are available.
The downloaded data is buffered at the factory until it is
needed. Using HP Enterprise Link in this way reduces the
probability that the factory would lack unprocessed
production orders if the WAN is down for a prolonged period
of time.

Buffered data must be preserved even if the HP Enterprise
Link host machine is shut down or crashes. To do this, HP
Enterprise Link stores buffered data in disk-resident spool
files.

The amount of storage used to hold buffered data must be
restricted to protect the host computer from failure caused
by insufficient resources. HP Enterprise Link can limit the
size of spool files by controlling:

■ The maximum size of spool storage

■ The maximum number of messages buffered

■ The age of the oldest message buffered.

The user can set any one or all of these limits, using the
HP Enterprise Link configuration tool.
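The three limits can be combined in a single bounded buffer, as
in the sketch below. Two things here are assumptions rather than
facts from the article: the buffer is held in memory instead of in
a disk-resident file, and the oldest message is discarded when a
limit is exceeded (the article does not say what the product does
at that point). The class name is hypothetical.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple


class BoundedSpool:
    """In-memory stand-in for a spool with optional limits."""

    def __init__(self, max_bytes: Optional[int] = None,
                 max_messages: Optional[int] = None,
                 max_age: Optional[float] = None) -> None:
        self.max_bytes = max_bytes
        self.max_messages = max_messages
        self.max_age = max_age          # seconds
        self._items: Deque[Tuple[float, bytes]] = deque()
        self._bytes = 0

    def put(self, payload: bytes, now: float) -> None:
        """Buffer a message, then enforce whichever limits are set."""
        self._items.append((now, payload))
        self._bytes += len(payload)
        while self._items and self._over_limit(now):
            _, dropped = self._items.popleft()  # assumed policy
            self._bytes -= len(dropped)

    def _over_limit(self, now: float) -> bool:
        oldest_time = self._items[0][0]
        return (
            (self.max_bytes is not None
             and self._bytes > self.max_bytes)
            or (self.max_messages is not None
                and len(self._items) > self.max_messages)
            or (self.max_age is not None
                and now - oldest_time > self.max_age)
        )

    def messages(self) -> List[bytes]:
        return [payload for _, payload in self._items]
```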

Tracing 

HP Enterprise Link allows the data being transferred
to be monitored by the user. The monitoring is called
tracing. Tracing is useful for creating an audit trail of the
transferred data and for debugging and testing methods.
Tracing does not affect the data being transferred.

The configuration tool is used to enable and disable
tracing, but it is the data server that generates trace messages
when tracing is enabled.

Data can be traced at a number of different internal
locations within the data server (see Figure 8). Some of the
forms in which trace results can be expressed include:

■ Data as received by a data server communication object
from a source software application. This trace data is
expressed using the source software application's native
data format and includes the source address, the value
received or read, and the time of transfer.

■ Data as sent by a data server communication object to
the destination software application. This trace data is
expressed using the destination software application's
native data format and includes the destination address,
the value sent or written, and the time of transfer.

■ Data being mapped by the mapping engine. This trace
data is expressed using the data server mapping engine's
neutral data format and includes the source address, the
destination address, the value transferred, and the time
of transfer.

Error messages reported by the mapping engine or by
communication objects can also be included in the trace
output. This ability ensures that the relative sequencing of
data transfer messages and error messages is preserved,
which greatly aids the user when trying to troubleshoot
mapping problems.

Server Data Flow 

HP Enterprise Link allows the flow of data in the data
server to be interrupted at a number of different internal
points (see Figure 9). This is useful for isolating the
effects of data mappings during debugging and testing.
When an information flow is interrupted, data does
not pass the point of interruption; instead, the data is
discarded.

The flow of information being transferred from a
communication object to a software application can be
interrupted. Interrupting the flow here allows the data server
to read from mapped source addresses, map to new
destination addresses, and then discard the data just before
it would have been written to the destination software
application.

Figure 8

Tracing data that is transferred between applications.

Figure 9

Interrupt locations in the data server.

The flow of information being transferred from a software
application to a communication object can also be
independently interrupted. Interrupting the flow here allows
the data server to ignore all data sent to the
communication object by the source software application.

Data Integrity 

The HP Enterprise Link data server is carefully designed
to preserve the integrity of the data being mapped and
to map the data exactly once for each trigger event. The
design was influenced by considering how to react to
communication channel failures and data server process
terminations. The circumstances that could cause the
data server process to terminate are the following:

■ If a person or software process explicitly kills the data
server process

■ If the host machine suffers a hardware or software
failure, loses power, or is manually turned off.

Communication channel failures must be handled
carefully. If the communication channel connecting a
communication object to its software application fails, the data
being mapped at the time of failure must not be lost or
duplicated. Also, after normal operation of the
communication channel is restored, communication between the
communication object and its application must be
automatically established again and all interrupted data
transfers restarted.

The following steps are taken to ensure data integrity
when communication channels fail:

■ For data received from the source software application,
the communication object never acknowledges receipt
of the data until the data has safely been saved to a
disk-resident receive-spool file.

■ Data received by the communication object from the
source software application is not removed from the
receive-spool file until the data has successfully passed
through the mapping engine and been forwarded to the
communication object responsible for sending it to the
destination software application.

■ The communication object that sends data to the
destination software application only notifies the mapping
engine that it successfully received the data after the
data has been safely saved to a disk-resident transmit-
spool file. Also, it only removes data from the transmit-
spool file when the destination software application has
acknowledged successful receipt of the data.
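The ordering rule in the steps above (save to disk first,
acknowledge second; remove from the spool only after successful
forwarding) can be sketched with an ordinary file. This is an
illustration under assumptions, not the product's spool format:
the one-JSON-object-per-line file layout and the function names
are invented for the example, and a crash between forwarding and
rewriting the file could still cause a message to be forwarded
twice, so the sketch does not by itself achieve exactly-once
delivery.

```python
import json
import os
from typing import Callable


def receive(spool_path: str, message: dict,
            acknowledge: Callable[[dict], None]) -> None:
    """Save the message to the spool file, then acknowledge it."""
    with open(spool_path, "a", encoding="utf-8") as spool:
        spool.write(json.dumps(message) + "\n")
        spool.flush()
        os.fsync(spool.fileno())  # on disk before the sender is told
    acknowledge(message)


def forward_all(spool_path: str,
                forward: Callable[[dict], bool]) -> int:
    """Forward spooled messages in order; keep those not yet sent.

    Stops at the first failure and returns the number forwarded.
    """
    with open(spool_path, encoding="utf-8") as spool:
        pending = [json.loads(line) for line in spool if line.strip()]
    remaining = []
    for index, message in enumerate(pending):
        if not forward(message):
            remaining = pending[index:]
            break
    # Rewrite the spool atomically with only the unsent messages.
    temp_path = spool_path + ".tmp"
    with open(temp_path, "w", encoding="utf-8") as spool:
        for message in remaining:
            spool.write(json.dumps(message) + "\n")
        spool.flush()
        os.fsync(spool.fileno())
    os.replace(temp_path, spool_path)
    return len(pending) - len(remaining)
```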
Conclusion



The HP Enterprise Link product greatly reduces the cost
and effort required to interconnect business management
systems (such as SAP's R/3 product) and measurement and
control systems (such as Hewlett-Packard's RTAP/Plus
product). HP Enterprise Link is an off-the-shelf product
that allows users to connect software applications using
an easy-to-use point-and-click graphical user interface.

With HP Enterprise Link, companies can minimize the
costs associated with changes made to computer systems
and adapt more quickly to changing market conditions.

For more information about HP Enterprise Link, take a look
at the information located at the following URLs:

■ http://www.tmo.hp.com/tmo/pia/Vantera/Index/
English/Index.html

■ http://www.tmo.hp.com/tmo/pia/Vantera/Index/
English/Products.html

■ http://www.tmo.hp.com/tmo/pia/Vantera/Index/
English/ELink.html



HP-UX 9.* and 10.0 for HP 9000 Series 700 and 800 computers are X/Open Company UNIX 93
branded products.

UNIX is a registered trademark in the United States and other countries, licensed exclusively
through X/Open Company Limited.

X/Open is a registered trademark and the X device is a trademark of X/Open Company Limited
in the UK and other countries.

Microsoft is a U.S. registered trademark of Microsoft Corporation.

Windows is a U.S. registered trademark of Microsoft Corporation.
Knowledge Harvesting, Articulation, and
Delivery



Kemal A. Delic



Dominique Lahaix 



Harnessing expert knowledge and automating this knowledge to help solve
problems have been the goals of researchers and software practitioners since
the early days of artificial intelligence. A tool is described that offers a
semiautomated way for software support personnel to use the vast knowledge
and experience of experts to provide support to customers.



A consequence of the global shift toward networked desktops is visible
in customer technical support centers. Support personnel are overwhelmed
with telephone calls from customers who are experiencing a steady increase in
the number of problems with intricate software products on various platforms.

Support centers are staffed with less knowledgeable (and less experienced)
first-line agents answering the simple questions and solving common problems.
Expert (and more expensive) technicians resolve more complex problems and
execute troubleshooting procedures. The work of both (the first-line agents
and the technicians) is supported by various technical tools, but they always
have to use their brains and experience to handle effectively the stream of
problems they encounter. This knowledge is seen as the key ingredient for the
efficient functioning of support centers.

The number of calls and their complexity have both
increased. At the same time, support solution efficiency has
decreased as the cost for providing those solutions has
increased. As a result, there is a need for a knowledge
sharing solution in which the first-line agents will be able
to solve the majority of problems and escalate to the
technicians only the complex problems. To enable such a
solution, we have to:

■ Find efficient knowledge extraction methods 

■ Create compact, efficient knowledge representation 
models 

■ Use extracted knowledge directly in the customer
support operations.

This article describes the HP approach to providing
customer support in the Windows-Intel business segment.
This segment includes networked desktop environments
known for their high total cost of ownership. Help-desk
services for this segment are supposed to solve the
minority of problems with software applications, local area
networks, and interconnections.

The system described here, called WiseWare,* is a
knowledge harvesting and delivery system specifically designed
to provide partially automated help for HP customer
support centers in their problem solving chores.

Partial automation of help-desk support is seen as a
suitable, cost-effective solution that will:

■ Shorten the time spent per call 

■ Decrease the number of incoming calls (because of
proactive mechanisms)

■ Decrease the number of calls forwarded to the next
support level

■ Decrease the overall labor costs. 

The objective is to reduce dramatically the support costs
per seat per year.

Where Is Knowledge? 

To find the most efficient knowledge extraction methods,
we must first answer the question, "Where is the
knowledge?" Books, technical articles, journals, technical notes,
reports, and product documentation are all classical
resources that rely on a human being's ability to extract,
evaluate, and apply knowledge. Mechanized efforts still
can't replace these human attributes.

* WiseWare is an internal tool and not an HP product.

Current support solutions usually are based on electronic
collections in a free-text format, in which the important
concepts are expressed using natural human language.
The latest release of WiseWare uses technical notes,
frequently asked questions, help files, call log extracts, and
user submissions as the primary raw material. According
to the knowledge resource, different knowledge
representations and extraction methods are used.

Extensive research in the field of artificial intelligence has
created several knowledge representation and extraction
paradigms in which the final use for knowledge determines
the characteristics of the representation scheme. The
earliest knowledge extraction efforts, known as information
retrieval, initially had small industrial impact. However,
recent interest in the Internet and in electronic book
collections has revived the business interest in information
retrieval. Some of the hottest products on the market today
are search engines. Different search methods (by
keywords or by concepts) are being used and other search
methods (by examples and by natural language phrases)
are being investigated. Recent synergy with artificial
intelligence methods has created a promising subfield known
as intelligent information retrieval. The majority of today's
customer support solutions can be classified as enriched
information retrieval systems.

Electronic Document Libraries

Developments in the information retrieval field have
transformed free-text collections into more refined collections
known as electronic document libraries. Electronic
document libraries have an articulated structure (author,
subject, abstract, and keywords), enabling efficient searches
and classification. They combine advanced technological
methods (such as hypertext and multimedia) to fit users'
information retrieval needs. Some of the best support
solutions today are in a digital library class and represent
sophisticated document management systems.

Case-Based Retrieval

Early hardware support documentation contained
troubleshooting diagrams that made it possible for service
technicians to troubleshoot equipment consistently by
following the diagrams and performing the appropriate tests and
measurements. The recent revival of these diagrams is
Glossary



Cluster. Natural association of similar concepts, words, and
things.

Concept. Group of words conveying semantic content. It can be
described graphically as relationships between words having
different attributes (and in some cases as numerical
measurements of strength).

Data Mining. Collective name for the field of research dealing
with data analysis in large data depositories. It includes
statistics, machine learning, clustering, classification, visualization,
inductive learning, rule discovery, neural networks, Bayesian
statistics, and Bayesian belief networks.

Information Retrieval. Identification of documents or
information from the collection that is relevant for the particular
information need.

Keyword. Characteristic word that may enable efficient
retrieval of relevant documents. Two criteria used to assess the
value of a keyword are the number of documents retrieved and
the number of useful documents (recall and precision).

Knowledge. Group of interrelated concepts used to describe
a certain domain of interest. Complex structures formed by
emulating human behavior in certain activities (for example,
assessment, problem solving, diagnosing, reasoning, and
inducing). Different schemes are used to enable knowledge
representation such as rules, conceptual graphs, probability
networks, and decision trees. Knowledge is found in large text
collections and is biologically resident in human brains.

Knowledge Map. Graphical display of interrelated concepts. 

Knowledge Base. Complex entity typically containing a
database, application programs, search and retrieval engines,
multimedia tools, expert system knowledge, question and
answer systems, decision trees, case databases, probability
models, causal models, and other resources.

Metrics. Group of measurement methods and techniques
introduced to enable quantification of processes, tools, and
products.

Natural Language Processing. Activity related to concept
extraction from, formalization of, and methods deployment in
a problem area.

Paradigm. A theoretical framework of a discipline within
which theories, generalizations, and supporting experiments
are formulated.

Problem Domain. Area of interest defined by terminology,
concepts, and related knowledge.

Search. Activity guided by a find and match cycle in which a 
search space is usually explored with an appropriate choice of 
search words (keywords). Advanced search is done by concepts. 



seen in interactive troubleshooting systems that enable PC
hardware technicians to solve hardware problems. So far,
such systems are implemented as case-based retrieval (or
reasoning) systems. The majority of these systems provide
only retrieval; just a few include the reasoning component.
The case-based retrieval paradigm is based on the human
ability to solve problems by remembering previously
solved problems. The support system plays the role of an
electronic case database in which the knowledge consists
of documented experience (cases). Creation and
maintenance of the cases is an expensive and nontrivial process.
Currently, these activities are performed by humans and
are used mainly for hardware support. Such systems
cannot deal efficiently with large, complex, and dynamic
problem areas.



Rule-Based Systems 

Some support centers have tried to use expert systems
based on rules, but they have discovered that the rule-based
systems are difficult to create, maintain, and keep
consistent. Crafting a collection of rules is a complex
chore. It is not clear if this technology will have a
role in future knowledge representation and extraction
development.

Model-Based Systems

A model-based paradigm in which various statistical,
causal, probability, and behavioral models are used is
another example of knowledge representation for customer
support. The knowledge here is expressed by the
fault/failure model that contains quantified relationships
between causes, symptoms, and consequences. Basic
decision making is enabled with such models. Although
some limited experiments with this highly sophisticated
knowledge representation paradigm have been done, no
system is in operational use in support centers.

New Research 

The newest research in the field of data mining and knowledge
discovery may offer some potentially effective
knowledge representation methods for deployment in
customer support centers. This research aims at the
extraction of previously unknown patterns (insights) from
the existing data repositories. Research in artificial
intelligence has identified the initial assembly of a low-cost
knowledge base as a potential "engineering bottleneck."
The knowledge authoring environment discussion later
in this article addresses that issue. Because most of the
knowledge for WiseWare comes from text sources, we
will focus our attention here on the knowledge extraction
process.

WiseWare and Knowledge Refinement 

Knowledge is a fluid, hard-to-define but essential ingredient
for all human intellectual activities. It is difficult to
extract, articulate, and deploy. The prevailing quantity of
knowledge is encoded in the form of text (90 percent)
expressed in natural language and is articulated as a web
of interrelated concepts. A goal of research in natural
language is to enable automatic and semiautomatic extraction
of knowledge. Content analysis must be automated to
efficiently provide suggestions and solutions for users. As
we have already seen, several knowledge representation
paradigms are being invented and investigated (for
example, semantic nets, rules, cases, and decision trees).
Additionally, we can deploy various techniques to extract
concepts (symbolic knowledge) and numerical quantities
(numerical and statistical knowledge).

Refinement Process

Human experts use spreadsheets, outline processors, and
some vendor-specific tools to refine source text, but have
not yet developed systematic, efficient processing methods.
In the future, we would like to automate some phases of
this process, leading toward more efficient and effective
deployment.

Knowledge refinement is seen as a process for converting
raw text into coherent, compact, and effective knowledge
forms suitable for software problem solving and assistance
(for example, decision trees, rules, probability models,
and semantic nets). The basic raw material (the knowledge
in its primary form) remains accessible. This preserves
previous investments in knowledge and enables integration
into future, more sophisticated solutions.

We can describe the knowledge refinement process as
efforts made to transform raw text to a compact
representation and then to operational knowledge. Associated
costs increase as raw text moves through the refinement
process to become operational knowledge.

Currently WiseWare content is partitioned into three
conceptual categories: fixes, step notes, and technical notes.
The first two contain shallow, specific knowledge and the
third contains complex technical concepts. A fix is a simple,
short document that describes with fewer than 100
words a known and recurring problem with a known
solution, the fix (see Figure 1a). A fix often helps the
customer out of the immediate problem but does not provide
a long-term solution. It is essentially a "quick fix."

A step note usually walks the user through a procedure
that prevents the problem from occurring in the future
(see Figure 1b). The step note requires more of the user's
time to solve the immediate problem than the fix does,
but it saves time in the future.

Both fixes and step notes offer additional references.
Those references contain keywords providing links to
technical notes that explain the most relevant related
subjects in detail. Technical notes require deep technical
knowledge to be properly understood and applied.

The whole collection of fixes, step notes, and technical
notes is tagged to associate the content of each document
with the proper problem classes. Consequently, WiseWare
content is perceived by the user as a repository of advice
and solutions for given problems (quick fixes, step-by-step
procedures, and technical theory).

Some generic activities in the refinement process can be 
denoted as: 

• Assessment
• Extraction
• Filtering
• Summarization
• Clustering
• Classification.

Figure 1

Two WiseWare screens: (a) WiseWare fix screen, (b) WiseWare step note screen.

We can describe the evolution of WiseWare as going from
answering questions to giving advice and finally to problem
solving and troubleshooting. The support costs in this
evolution have escalated as the problems have become
more complex.

Knowledge Authoring Environment

Since a critical mass of knowledge can be reached only
if multiple authors contribute to the knowledge base, the
knowledge authoring environment must be able to deal
with multiauthor issues effectively. Additionally, because
the knowledge authoring environment is deployed on a
worldwide basis, the issue of different languages is relevant
as well. Finally, deployment in different time zones
requires very high reliability and availability of the
knowledge authoring environment.

The quality of the knowledge is constantly monitored and
refined. Areas for improvement are pinpointed by analyzing
results reported on the knowledge base logs. As weak
points are identified and strengthened, better system
performance will help to optimize return on investment
figures. Even user satisfaction can be assessed from the
various logs and usage traces that will reflect a combined
measure of system quality and usefulness.

Future worldwide cooperation among support centers
to share knowledge is our objective. Ideally, each center
will deploy and create the necessary knowledge locally.
Centers operate in different time zones, have different
cultural and social contexts, and have the ability to manipulate
huge amounts of data, information, and knowledge.
Coordinating the knowledge bases for all support centers
poses several challenging problems. The complexity of
these problems is reduced by careful engineering and
incremental deployment. The result is a low-cost,
knowledge-based support, adding new value to the support
business.

In a very advanced situation, and from a long-term
perspective, extracted knowledge will become the crucial
ingredient for the next development phase. In this phase,
human mediation in problem solving could be removed.
Support could be delivered electronically without human
intervention. For example, imagine intelligent agents
traveling over the network to the troubled system to fix
a problem. Current viruses on the Internet are doing
exactly the opposite task. What if the trend were reversed?
Support knowledge could be adapted so that healing
viruses could travel through a system, delivering remote
fixes. To understand how this could become a reality, let's
review the history of WiseWare.

WiseWare Architecture

In November of 1995, the first challenge was posed to the
WiseWare team when the French call center decided to
outsource low-end software support services. Their support
personnel were without computer technology background
and demonstrated poor English language skills.
The knowledge department in HP's Software Services
Division in Europe responded to the challenge and delivered
the first operational WiseWare solution in April of
1996. Since then, new releases have been issued every two
months with steady improvements.

In the WiseWare release 4.1, mirroring intranet servers
(Europe and the United States) cover three super regions.
The number and quality of accessible documents is
constantly improved, while use of the system is closely
monitored from access and search logs. We have established
close links with software vendors who allow us
privileged access to their documents. (The legal framework
for cooperation and alliances is defined as well.) All
activities and services undergo quality assurance scrutiny
in preparation for ISO-9000 certification.

WiseWare provides approximately 80,000 documents to 13
call centers worldwide. The average problem resolution
assistance rate is over 30 percent. More than 40 products
are covered in the various types of documents offering
quick fixes for agents and in-depth technical knowledge
for advanced WiseWare users.

WiseWare is a distributed system with three major parts:
production, publishing, and monitoring (see Figure 2).
They are implemented on UNIX and Windows NT platforms,
with intranet technology providing the necessary
glue for client/server solutions. It is a nonstop, highly
available system. The key advantage of the WiseWare
system lies in the tight loop between the monitoring and
production areas in which the principal objective is to
provide users with highly adaptable documents for everyday
problem-solving chores. Data mining and natural
language processing modules dynamically create user,
problem, and document profiles that will then drive the
production side, enabling technical and business insights
to be gleaned from large and extensive access and search
logs.

Figure 2

WiseWare system architecture.

At this time, customers call the express hubs and explain
their problems to support personnel using natural language
constructs that sometimes blur the real nature of
the problem. According to their understanding, support
personnel create and launch a search phrase. It is a
Boolean construct containing relevant keywords or free-text
phrases that roughly represent the problem. Different
search, hit, and presentation strategies are currently used,
but formation of the effective search query and reduction
of the number of relevant replies are largely still
unresolved. A mixture of artificial intelligence techniques and
traditional information retrieval and database methods is
being offered as a potential solution.

Table I shows how one, two, and three words in a typical
search phrase can influence the number of relevant documents
returned with the current version of WiseWare. A
well-formed phrase helps to quickly pinpoint relevant
documents while retaining necessary coverage of the problem
area. Notice the quick decrease in the number of relevant
documents returned as the phrase becomes longer.
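
The narrowing effect of longer search phrases can be illustrated with a
small sketch. It assumes that the "+" in a search phrase means that every
keyword must be present (a Boolean AND), and it uses a tiny hypothetical
document collection. The names build_index and search_all are illustrative
only and are not part of WiseWare.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search_all(index, phrase):
    """Return ids of documents containing every keyword in a '+'-joined phrase."""
    keywords = [k.lower() for k in phrase.split("+") if k]
    if not keywords:
        return set()
    hits = set(index.get(keywords[0], set()))
    for keyword in keywords[1:]:
        hits &= index.get(keyword, set())  # each extra keyword can only shrink the hit set
    return hits

# Hypothetical documents, for illustration only.
documents = {
    1: "change user password in user manager",
    2: "change display settings",
    3: "change user profile path",
    4: "create new message",
}
index = build_index(documents)
counts = [len(search_all(index, phrase))
          for phrase in ("Change", "Change+User", "Change+User+Password")]
```

Because each additional keyword intersects the previous hit set, the
number of returned documents can never grow as the phrase gets longer.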



Support center personnel work under time-pressured,
stressful circumstances. As a result, the whole
human-computer interaction issue must be carefully considered.
Efficiently delivering advice and problem-solving assistance
can depend on the smallest detail. Besides the
quality of the material in the supporting knowledge base,
questions regarding query formulation and presentation of
the retrieved information will influence final acceptance
from the users. Support activities can be treated as symbiotic
human-machine problem solving in a bidirectional
learning paradigm. The user learns how to manipulate the
system (facilitated by language features such as localization
and query wizards). At the same time, the system
adapts to the user's methods of accessing the knowledge
base. The WiseWare system learns user behavior from
access and language patterns. Interaction with the system
customizes the environment to suit the specific user's
profile. The reasoning activity is still done by humans and
is supported by refined electronic collections. Good synergy
and efficient functioning of such human-computer
systems are the current objectives.

Because the support centers are located in different
geographical, cultural, and language areas, the natural
language layer is seen as crucial for search and presentation.

Table I

Search phrases and number of documents returned

                          Search Phrases/Documents Returned
Application/   Change  Change+User  Change+User+  Create  Create+New  Create+New+
Documents                           Password                          Message
WinNT/6046     1371    [illegible]  [illegible]   -       -           -
Win3.x/4397    1224    1            [illegible]   -       -           -
CC:Mail/2600   -       -            -             727     12          2
MS:Mail/3963   -       -            -             878     21          2



Technological advances in visual search and delivery
combined with audio and video techniques may improve
the quality and efficiency of the system. Better architecture
combined with object-oriented (multimedia) databases
will add another dimension to the delivery phase.
These improvements will be made over time and will be
accelerated by technological developments in related
fields.



Conclusion 



Accessible knowledge is the essential ingredient for
successfully dealing with the rising quantity and complexity
of customer support calls. A semiautomated system with
refined knowledge in reusable forms can enable users to
share knowledge among different, geographically dispersed
customer support centers. The overall objective
of HP's WiseWare server is to provide low-cost, effective
customer support. This is a simple objective but one that
is difficult to achieve, especially when significant effort
and investment are required to achieve technological
breakthroughs in the problem-solving field.

In the short term, incremental deployment of advanced
methods such as data mining and natural language
processing techniques will improve system quality and usage.
In the long run, it is very likely that most of the client-hub
telephone voice communication will be gradually replaced
by computer-computer communication. Several layers of
the present problem-solving architecture will disappear
or will be replaced by some new elements. The
problem-solving knowledge along with search and access log
collections being developed now will serve as the fundamental
basis for future electronic support.
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A Theoretical Derivation of Relationships 
between Forecast Errors 



Jerry Z. Shan



This paper studies errors in forecasting the demand for a component used by
several products. Because data for the component demand (both actual demand
and forecast demand) at the aggregate product level is easier to obtain than at
the individual product level, the study focuses on the theoretical relationships
between forecast errors at these two levels.

With a sound theoretical foundation for understanding forecast errors, a
much more confident job can be done in forecasting and in related planning
work, even under uncertain business conditions.

In a typical material planning process, planners are constantly challenged by
forecast inaccuracies or errors. For example, should a component forecast
error be measured for each platform for which it may be needed, or should its
forecast accuracy be measured at the aggregate level, across platforms? What
is the relation between the two accuracy measures?

This paper describes a theoretical study of forecast errors. First, we formally
define forecast errors with different rationales, derive several relationships
among them, and prove a heuristic formula proposed by Mark Sower. Then
we study the effects of a systematic bias on the forecast errors. Finally, we
extend our study to the situations where correlations across product demands
and time effects in demand and forecast are taken into account. Definitions
and theorems are presented first, and proofs of the theorems are given at the
end of the paper.

Basic Concepts 

Consider the case of a component that can be used for the manufacture of n
different products, or platforms. For platform $i$ ($1 \le i \le n$), denote by $F_i$ the
forecast demand for the component, and by $D_i$ the actual demand. In the
treatment of forecast and actual, we propose in this paper the following
framework: Regard forecast demand as deterministic, or
predetermined, and actual demand as stochastic. By
stochastic, we mean that given the same operating environment
or experimental conditions, the actual demand
can be different from one operation run to another. Thus,
we can postulate a probability distribution for it.

For a generic case, denote by $D$ the actual demand and by
$F$ the forecast. We call the forecast unbiased if $E(D) = F$,
where $E(D)$ denotes the expectation, or expected value,
of $D$ with respect to its probability distribution. Practically
speaking, this unbiased requirement means that over many
runs under the same operating conditions, the average of
the realized demand is the same as the forecast. If there is
a deterministic quantity $b \ne 0$ such that $E(D) = F + b$, then
we say the forecast is biased, and the bias is $b$. In practice,
this means that there is a systematic departure of the
average realized demand from the forecast.
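
As a minimal sketch of this definition, the bias can be estimated from
repeated runs as the average realized demand minus the forecast. The
function name estimate_bias and the demand figures below are hypothetical
and serve only as an illustration.

```python
from statistics import mean

def estimate_bias(realized_demands, forecast):
    """Estimate b in E(D) = F + b as the average realized demand minus the forecast."""
    return mean(realized_demands) - forecast

# Hypothetical realized demands from repeated runs against a forecast of 100 units.
runs = [104, 97, 110, 101, 108]
b_hat = estimate_bias(runs, forecast=100)
```

A positive estimate suggests that the forecast tends to fall below the
realized demand, and a negative estimate suggests the opposite.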

Throughout the paper, we often make the normality
assumption on the demand, that is, for unbiased forecasts,
we assume that the demand $D$ has a normal (Gaussian)
distribution with mean $F$ and standard deviation $\sigma$, that is,
$D \sim N(F, \sigma^2)$. Is this a reasonable assumption in reality?
The answer is yes. First of all, this assumption is technically
equivalent to assuming that the difference $\varepsilon = D - F$
between the actual demand $D$ and the forecast $F$ is normally
distributed: $\varepsilon \sim N(0, \sigma^2)$. The validity of this latter
assumption is based on the fact that the difference between
the actual demand and a good forecast is some aggregation
of many small random errors, and on the central
limit theorem, which states that the aggregation of many
small random errors has a limiting normal distribution.

Unbiased Forecast Case

In this section, we assume unbiased forecasting at all
platforms. Statistically, $E(D_i) = F_i$, where $F_i$ is the forecast
for the common component at platform $i$, and $D_i$ is
the actual demand of the component at platform $i$.

Definition 1: (Same Weight Mean Based) Define $E_\pi = E(e_\pi)$
to be the forecast error at the mean (average) platform
level, and $E_a = E(e_a)$ to be the forecast error at the
aggregate platform level, where:

$$e_\pi = \frac{1}{n}\sum_{i=1}^{n}\frac{|D_i - F_i|}{F_i} \tag{1a}$$

and

$$e_a = \frac{\left|\sum_{i=1}^{n} D_i - \sum_{i=1}^{n} F_i\right|}{\sum_{i=1}^{n} F_i} \tag{1b}$$




The rationale of defining the forecast error at the
mean level and at the aggregate level is as follows. Let
$e_i = |D_i - F_i|/F_i$. Then $e_i$ measures, in terms of the relative
difference, the forecast error at a single platform $i$.
Accordingly, $e_a$ measures the forecast error, also in terms
of the relative difference, at the aggregate level from all
platforms, and $e_\pi$ provides an estimate for the forecast
error at any individual platform since it is the average of
the forecast errors over all individual platforms. Because
all the quantities in equation 1 are stochastic, we take
expectations to get their deterministic means. Now, a
natural question is: What is the relation between the
errors at the two different levels?

Theorem 1: Based on definition 1, and assuming that
$D_i \sim N(F_i, \sigma^2)$, $i = 1, 2, \ldots, n$, and that the $D_i$ are uncorrelated
(strictly speaking, we also need the joint normality
assumption, which in general can be satisfied), we have:

1. $E_\pi = \sqrt{n}\,E_a C_n$, where:

$$C_n = \left(\frac{1}{n}\sum_{i=1}^{n} F_i\right)\left(\frac{1}{n}\sum_{i=1}^{n}\frac{1}{F_i}\right) \tag{2}$$

2. It is always true that $C_n \ge 1$, and $C_n = 1$ if and only if the
forecasts across all the platforms are the same.
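
The relationship in theorem 1 can be checked numerically. The sketch
below is a simple Monte Carlo experiment under the theorem's assumptions
(independent normal demands with a common standard deviation). The
forecasts, the value of sigma, and the function names are hypothetical
choices made for this illustration.

```python
import math
import random

def c_n(forecasts):
    """C_n from equation 2: (mean of F_i) times (mean of 1/F_i)."""
    n = len(forecasts)
    return (sum(forecasts) / n) * (sum(1.0 / f for f in forecasts) / n)

def simulate_definition_1(forecasts, sigma, runs, seed=0):
    """Monte Carlo estimates of E_pi and E_a under definition 1.

    Demands are drawn independently as D_i ~ N(F_i, sigma^2).
    """
    rng = random.Random(seed)
    n = len(forecasts)
    total_f = sum(forecasts)
    sum_pi = 0.0
    sum_a = 0.0
    for _ in range(runs):
        diffs = [rng.gauss(0.0, sigma) for _ in forecasts]  # D_i - F_i
        sum_pi += sum(abs(d) / f for d, f in zip(diffs, forecasts)) / n
        sum_a += abs(sum(diffs)) / total_f
    return sum_pi / runs, sum_a / runs

forecasts = [100.0, 200.0, 400.0]  # hypothetical platform forecasts
e_pi, e_a = simulate_definition_1(forecasts, sigma=20.0, runs=50_000)
predicted = math.sqrt(len(forecasts)) * e_a * c_n(forecasts)
```

With enough runs, e_pi and predicted should agree to within sampling
error, and c_n returns 1 only when all forecasts are equal.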

We note that in the definition for $e_\pi$, we used the same
weight, $1/n$, for all platforms. If instead we use a weight
proportional to the forecast at the platform, then we have
the following:

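Definition 2: (Weighted Mean Based) Define $E_\pi = E(e_\pi)$
and $E_a = E(e_a)$, where:

$$e_\pi = \sum_{i=1}^{n}\frac{F_i}{\sum_{j=1}^{n} F_j}\cdot\frac{|D_i - F_i|}{F_i} = \frac{\sum_{i=1}^{n}|D_i - F_i|}{\sum_{i=1}^{n} F_i} \tag{3a}$$

and

$$e_a = \frac{\left|\sum_{i=1}^{n} D_i - \sum_{i=1}^{n} F_i\right|}{\sum_{i=1}^{n} F_i} \tag{3b}$$
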



Theorem 2: Based on definition 2 and with the same
assumptions as in theorem 1, we have:

$$E_\pi = \sqrt{n}\,E_a. \tag{4}$$

Mark Sower proposed this heuristic formula. Theorem 2
says that under suitable conditions, equation 4 holds
exactly.
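
A short derivation, offered here as a sketch under the assumptions of
theorem 1, indicates why no constant such as $C_n$ appears with the
weighted definition. It uses the standard fact that $E|Z| = \sigma\sqrt{2/\pi}$
for $Z \sim N(0, \sigma^2)$, together with $\sum_{i}(D_i - F_i) \sim N(0, n\sigma^2)$
for uncorrelated, jointly normal demands:

$$E_\pi = \frac{\sum_{i=1}^{n} E|D_i - F_i|}{\sum_{i=1}^{n} F_i} = \frac{n\,\sigma\sqrt{2/\pi}}{\sum_{i=1}^{n} F_i}, \qquad E_a = \frac{E\left|\sum_{i=1}^{n}(D_i - F_i)\right|}{\sum_{i=1}^{n} F_i} = \frac{\sqrt{n}\,\sigma\sqrt{2/\pi}}{\sum_{i=1}^{n} F_i}$$

Dividing the first expression by the second gives $E_\pi/E_a = \sqrt{n}$, because
the common denominator $\sum_i F_i$ cancels.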

Other researchers have addressed a similar problem from
the perspective of demand variability. In measuring the
relative errors of the forecast at the individual platforms,
it was assumed that $\sigma_i/\mu_i$ ($i = 1, 2, \ldots, n$) are the same,
where $\sigma_i$ is a measure of demand variability and $\mu_i$ is the
mean demand at platform $i$. The advantage here is we do
not need to make such a strong assumption. In fact, any
measure of the forecast error at the individual platform
level can be interpreted as the forecast error at an averaged
individual platform.

The following definition of error is based on this observation
in practice. The standard deviation of a random variable
can be very large if the values this random variable
takes on are very large. A more sensible error measure of
such a random variable would be the relative error rather
than the absolute error. So, given a random variable $X$,
we can measure its error by the coefficient of variation
$cv(X) = \sigma(X)/E(X)$ rather than by its standard deviation $\sigma(X)$.
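
The point about scale can be made concrete with a small sketch. The two
demand samples below are hypothetical; the second is the first multiplied
by 100, so its standard deviation is 100 times larger while its relative
error is unchanged.

```python
from statistics import mean, pstdev

def coefficient_of_variation(values):
    """Return the population standard deviation divided by the mean."""
    return pstdev(values) / mean(values)

small = [8.0, 10.0, 12.0]         # hypothetical low-volume demands
large = [800.0, 1000.0, 1200.0]   # same pattern at 100 times the volume

cv_small = coefficient_of_variation(small)
cv_large = coefficient_of_variation(large)
```

Here pstdev treats the values as the whole population; using a sample
standard deviation instead would be an equally reasonable choice for
estimated data.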

With the unbiased forecast assumption, the forecast error
at platform $i$ can be measured by $cv(D_i)$. The average of
these coefficients over all platforms is a good measure of
the forecast error at the individual platform level. On the
other hand, $\sum_{i=1}^{n} D_i$ is the demand from all platforms, and
$\sum_{i=1}^{n} F_i$ is the corresponding forecast, so $cv\left(\sum_{i=1}^{n} D_i\right)$ is a
good measure of the forecast error at the aggregated platform
level.



Definition 3: (CV Based) Define:

$$E_\pi = \sum_{i=1}^{n} cv(D_i)/n \quad\text{and}\quad E_a = cv\left(\sum_{i=1}^{n} D_i\right).$$




Theorem 3: Based on definition 3, and assuming that the
$D_i$ are uncorrelated, we have

$$E_\pi = \sqrt{n}\,E_a C_n \tag{5}$$

where $C_n$ is defined in equation 2. For theorem 3, we do
not have to assume normality to get the relevant results.
This is also true for theorem 4.

General Case: The Effect of Bias 

We assume here that forecasts are consistently biased.
This is expressed as $E(D_i) = F_i + b$, where $b$ denotes the
common forecast bias. This indicates that $F_i$ overestimates
demand when $b < 0$ and underestimates demand
when $b > 0$.

Can we extend the use of definition 3 for the forecast
errors to this general case? The answer is no. This is because
the standard deviation is independent of bias, and
therefore one could erroneously conclude that the forecast
error is small when the standard deviation is small,
even though the bias $b$ is very significant. Instead, the
forecast error now should be measured by the functional:

$$e(D, F) = \sqrt{E([D - F]^2)}\,/\,F \tag{6}$$

rather than by the cv, which is $\sqrt{E([D - E(D)]^2)}\,/\,E(D)$.
Hence, in parallel with definition 3, we have the following
definition.

Definition 4: (e-functional Based) Define:

$$E_a = e\left(\sum_{i=1}^{n} D_i, \sum_{i=1}^{n} F_i\right) \quad\text{and}\quad E_\pi = \sum_{i=1}^{n} e(D_i, F_i)/n,$$

where the functional $e$ is defined in equation 6.

If the bias $b = 0$, then the functional $e$ in equation 6 is
the same as the cv, and hence definitions 3 and 4 are
equivalent.

Theorem 4: Based on definition 4, and assuming that
$D_i \sim (F_i + b, \sigma^2)$,* $i = 1, 2, \ldots, n$ and that the $D_i$ are uncorrelated,
we then have:

$$E_\pi = \sqrt{n}\,E_a C_n \sqrt{\frac{\sigma^2 + b^2}{\sigma^2 + nb^2}} \tag{7}$$

where $C_n$ is given in equation 2.
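
The size of the bias factor in theorem 4 can be explored with a short
sketch. The closed forms below follow from $E([D_i - F_i]^2) = \sigma^2 + b^2$ and
$E([\sum_i (D_i - F_i)]^2) = n\sigma^2 + n^2 b^2$ for uncorrelated demands. The function
names and the numerical values are hypothetical and chosen for illustration.

```python
import math

def c_n(forecasts):
    """C_n from equation 2: (mean of F_i) times (mean of 1/F_i)."""
    n = len(forecasts)
    return (sum(forecasts) / n) * (sum(1.0 / f for f in forecasts) / n)

def bias_factor_rms(sigma, b, n):
    """Multiplying factor in equation 7: sqrt((sigma^2 + b^2) / (sigma^2 + n*b^2))."""
    return math.sqrt((sigma**2 + b**2) / (sigma**2 + n * b**2))

def errors_definition_4(forecasts, sigma, b):
    """Closed-form E_pi and E_a under definition 4 for uncorrelated D_i with
    mean F_i + b and standard deviation sigma."""
    n = len(forecasts)
    e_pi = sum(math.sqrt(sigma**2 + b**2) / f for f in forecasts) / n
    e_a = math.sqrt(n * sigma**2 + (n * b) ** 2) / sum(forecasts)
    return e_pi, e_a

forecasts = [100.0, 200.0, 400.0]  # hypothetical platform forecasts
e_pi, e_a = errors_definition_4(forecasts, sigma=20.0, b=10.0)
rhs = math.sqrt(len(forecasts)) * e_a * c_n(forecasts) * bias_factor_rms(20.0, 10.0, 3)
```

The factor equals 1 when b is 0 and falls below 1 as the bias grows
relative to sigma, for any n greater than 1.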

Since definition 1 considers the relative difference between
the forecast and the actual, any bias in the forecast
will be retained in the difference, so there is no problem
in using this definition even if there is bias. However, the
relation between the two errors has changed.

Theorem 5: Based on definition 1 and the assumption that
$D_i \sim N(F_i + b, \sigma^2)$, $i = 1, 2, \ldots, n$ and that the $D_i$ are uncorrelated,
we have:

$$E_\pi = \sqrt{n}\,E_a C_n\,\frac{\sqrt{2/\pi}\,\sigma e^{-b^2/(2\sigma^2)} + b\,[2\Phi(b/\sigma) - 1]}{\sqrt{2/\pi}\,\sigma e^{-nb^2/(2\sigma^2)} + \sqrt{n}\,b\,[2\Phi(\sqrt{n}\,b/\sigma) - 1]} \tag{8}$$

where $C_n$ is defined in equation 2, and $\Phi(x)$ is the cumulative
distribution function of the standard normal distribution
$N(0, 1)$ at $x$.
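
The bias factor in theorem 5 can be evaluated with the standard library
alone. The sketch below writes the factor as $\sqrt{n}\,H(b, \sigma)/H(nb, \sqrt{n}\,\sigma)$,
where $H(b, \sigma) = E|X|$ for $X \sim N(b, \sigma^2)$; this is algebraically the same
ratio as in equation 8. The function names and test values are hypothetical.

```python
import math

def std_normal_cdf(x):
    """Cumulative distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mean_abs_normal(b, sigma):
    """H(b, sigma) = E|X| for X ~ N(b, sigma^2)."""
    return (math.sqrt(2.0 / math.pi) * sigma * math.exp(-b**2 / (2.0 * sigma**2))
            + b * (2.0 * std_normal_cdf(b / sigma) - 1.0))

def bias_factor_abs(sigma, b, n):
    """Multiplying factor in equation 8, as sqrt(n) * H(b, sigma) / H(n*b, sqrt(n)*sigma)."""
    return (math.sqrt(n) * mean_abs_normal(b, sigma)
            / mean_abs_normal(n * b, math.sqrt(n) * sigma))

factor = bias_factor_abs(sigma=20.0, b=10.0, n=4)
```

The factor is 1 when there is no bias, and it depends on b only through
its magnitude, so overestimating and underestimating forecasts are
treated symmetrically.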

If there is no bias in the forecasting, the relationships
between the errors at the two levels are exactly the same for
definitions 1 and 3: $E_\pi = \sqrt{n}\,E_a C_n$. This formula, with the
introduction of the constant $C_n$, is slightly different from
the hypothesized equation 4. As noted in theorem 1, it is
always true that $C_n \ge 1$. If we use definition 2, then equation 4
holds exactly.

If there is bias in the forecasting, then in each relationship
formula (equation 7 or equation 8), there is another
multiplying factor that reflects the effect of the bias. One can
easily find that both of these multiplying factors are less
than or equal to 1. This implies that, compared to the
error at the component level, the error at the
platform-component level when forecast bias exists is less than
when the forecast bias does not exist.

If bias does exist, as it does in reality, it seems that the
multiplying factor resulting from bias in either equation 7
or equation 8 should be taken into consideration, with
suitable estimation of the parameters involved.

* The notation $X \sim (\mu, \sigma^2)$ means that $X$ has mean $\mu$ and standard deviation $\sigma$ but is not
necessarily normally distributed.



Correlated Demands 

It is reasonable to assume that demand for a component
for one platform affects demand for this component for
another platform. Also, for a given platform, there is usually
a strong correlation between the current demand and
the historical demands. The forecast is usually made
based on the historical demands. In this section, we first
propose a correlated multivariate normal distribution
model for the demand stream when the platform is
indexed, and then propose a time-series model for the
demand and forecast streams when time is indexed. Our
goal is to expand our study of the relationship between
the two layers of forecast errors in the presence of
correlations. Throughout this section, we assume unbiased
forecasts, and use the weighted average definition
(definition 2) for the forecast error.

Correlated Normal Distribution Model at a Time Point. In
this subsection we consider the case where there is
correlation across platform demands, but we still assume
that time does not affect demand. Suppose that the demand
stream $D_i$, $i = 1, 2, \ldots, n$ can be modeled by a correlated
normal distribution such that $D_i \sim N(F_i, \sigma^2)$ for $i = 1,
2, \ldots, n$ and that there is a correlation between different $D_i$
expressed as $\mathrm{Cov}(D_i, D_j) = \sigma^2\rho_{ij}$ for $1 \le i \ne j \le n$. With this
assumption on the demand stream, we have the following
result.

Theorem 6: Based on definition 2 and the above correlated
normal distribution modeling for the demand
stream, we have:

$$E_\pi = \frac{n}{\sqrt{n + \sum_{i \ne j}\rho_{ij}}}\,E_a \tag{9}$$

In particular, if $\rho_{ij} = \rho$ for all $1 \le i \ne j \le n$, then we get:

$$E_\pi = \frac{\sqrt{n}}{\sqrt{(n - 1)\rho + 1}}\,E_a \tag{10}$$

When the common correlation coefficient $\rho$ is 0 or near 0,
we see that equation 4 holds exactly or approximately.
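
The effect of correlation on the ratio between the two error levels can
be sketched as follows. The general form used here, $n/\sqrt{n + \sum_{i \ne j}\rho_{ij}}$,
is an assumption of this sketch: it follows from
$\mathrm{Var}(\sum_i D_i) = \sigma^2(n + \sum_{i \ne j}\rho_{ij})$ under definition 2 and reduces to
the common-correlation ratio of equation 10. The function names are hypothetical.

```python
import math

def ratio_correlated(rho):
    """E_pi / E_a under definition 2 for a correlation matrix rho (list of lists).

    Uses n / sqrt(n + sum of off-diagonal rho_ij), which follows from
    Var(sum of D_i) = sigma^2 * (n + sum over i != j of rho_ij).
    """
    n = len(rho)
    off_diagonal = sum(rho[i][j] for i in range(n) for j in range(n) if i != j)
    return n / math.sqrt(n + off_diagonal)

def ratio_common_rho(n, rho):
    """Equation 10: sqrt(n) / sqrt((n - 1) * rho + 1) for a common correlation rho."""
    return math.sqrt(n) / math.sqrt((n - 1) * rho + 1.0)

n = 4
matrix = [[1.0 if i == j else 0.3 for j in range(n)] for i in range(n)]
general = ratio_correlated(matrix)
common = ratio_common_rho(n, 0.3)
```

With no correlation the ratio is the square root of n, and with perfect
positive correlation it drops to 1, since aggregation then offers no
averaging benefit.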

Autoregressive Time Series Model. Now we take into
consideration the time effect in the product demand. For
platform $i$, $i = 1, 2, \ldots, n$ at time $t$, $t = 1, 2, \ldots$, denote by $D_i^{(t)}$
the demand and $F_i^{(t)}$ the forecast. Suppose that the demand
stream over time at each platform can be modeled
by an autoregressive model AR(p). At platform $i$, the
autoregressive model assumes that the demand at the current
time $t$ is a linear function of the past demands plus a random
disturbance, that is:

$$D_i^{(t)} = \sum_{j=1}^{p} a_{ij} D_i^{(t-j)} + \varepsilon_i^{(t)}$$

where the $a_{ij}$ are constant coefficients. Further, suppose
that the forecast $F_i^{(t)}$ is optimal given the historical demand
profile $\mathcal{F}_i^{(t-1)} = \sigma(D_i^{(1)}, D_i^{(2)}, \ldots, D_i^{(t-1)})$, that is,
with $D_i^{(0)}, D_i^{(-1)}, \ldots, D_i^{(-p+1)}$ properly initialized, for
$t \ge 1$:

$$D_i^{(t)} = a_{i1} D_i^{(t-1)} + a_{i2} D_i^{(t-2)} + \cdots + a_{ip} D_i^{(t-p)} + \varepsilon_i^{(t)}$$

and

$$F_i^{(t)} = E\left(D_i^{(t)} \mid \mathcal{F}_i^{(t-1)}\right) = a_{i1} D_i^{(t-1)} + a_{i2} D_i^{(t-2)} + \cdots + a_{ip} D_i^{(t-p)},$$

where $\varepsilon_i^{(1)}, \varepsilon_i^{(2)}, \ldots$ are independently and identically
distributed as $N(0, \sigma_i^2)$ and the random disturbance
at time $t$, that is, $\varepsilon_i^{(t)}$, is independent of the demand stream
before time $t$, that is, $(D_i^{(t-1)}, D_i^{(t-2)}, \ldots)$. Also, we
assume independence across platforms. With the above
modeling of the demand and forecast, what can we say
about the relationship between the two layers of forecast
errors?
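
The autoregressive forecast can be sketched for a single platform as
follows. The coefficients, starting demands, and function names are
hypothetical. The forecast is the linear combination of the p most
recent demands, so the forecast error at each step is exactly the random
disturbance.

```python
import random

def ar_forecast(history, coefficients):
    """One-step forecast F(t) = a_1*D(t-1) + ... + a_p*D(t-p).

    `history` lists past demands oldest first; `coefficients` is [a_1, ..., a_p].
    """
    p = len(coefficients)
    recent = history[-p:][::-1]  # D(t-1), D(t-2), ..., D(t-p)
    return sum(a * d for a, d in zip(coefficients, recent))

def simulate_ar(initial, coefficients, sigma, steps, seed=0):
    """Generate demands from the AR(p) model and record each error D(t) - F(t)."""
    rng = random.Random(seed)
    demands = list(initial)
    errors = []
    for _ in range(steps):
        forecast = ar_forecast(demands, coefficients)
        demand = forecast + rng.gauss(0.0, sigma)  # AR(p) step with disturbance
        errors.append(demand - forecast)
        demands.append(demand)
    return demands, errors

# Hypothetical AR(2) platform: two initial demands and two coefficients.
demands, errors = simulate_ar([100.0, 100.0], [0.6, 0.4], sigma=5.0, steps=2000)
mean_error = sum(errors) / len(errors)
```

Because the disturbance has mean zero, the average forecast error in the
simulation should be close to zero, which is the unbiasedness assumed in
this section.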

Theorem 7: Based on definition 1 or 2 and the above
time-series modeling for the demand stream and forecast
stream, and assuming that the variances at all platforms
are the same, then at any time point, if definition 1 is
used:

$$E_\pi^{(t)} = \sqrt{n}\,E_a^{(t)} c_n^{(t)} \tag{11}$$

where

$$c_n^{(t)} = \frac{E\left(\dfrac{1}{n}\sum_{i=1}^{n}\dfrac{1}{F_i^{(t)}}\right)}{E\left(\dfrac{n}{\sum_{i=1}^{n} F_i^{(t)}}\right)}$$

and if definition 2 is used, then:

$$E_\pi^{(t)} = \sqrt{n}\,E_a^{(t)}. \tag{12}$$

Rewriting $C_n$ in equation 2 as

$$C_n = \frac{\dfrac{1}{n}\sum_{i=1}^{n}\dfrac{1}{F_i}}{\dfrac{n}{\sum_{i=1}^{n} F_i}}$$

and taking expectations for the numerator and denominator
separately in the expression leads to $c_n^{(t)}$. Hence, it is
always true that $c_n^{(t)} \ge 1$.

Proofs 

Theorem 1 is a special case of theorem 5. Theorem 3 is a
special case of theorem 4. The proof for theorem 6 is similar
to that for theorem 5, with an application of lemma 1.

Lemma 1: If $X \sim N(b, \sigma^2)$, then:

$$E|X| = \sqrt{2/\pi}\,\sigma e^{-b^2/(2\sigma^2)} + b\,[2\Phi(b/\sigma) - 1] \equiv H(b, \sigma). \tag{13}$$

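Proof of Lemma 1: Without loss of generality, we can
assume that $\sigma = 1$, since otherwise we can make a simple
transformation $Y = X/\sigma$.

$$E|X| = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}|x|\,e^{-(x-b)^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\int_{0}^{\infty} x\,e^{-(x-b)^2/2}\,dx + \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{0}(-x)\,e^{-(x-b)^2/2}\,dx = I(b) + I(-b),$$

where

$$I(b) = \frac{1}{\sqrt{2\pi}}\int_{0}^{\infty} x\,e^{-(x-b)^2/2}\,dx = \frac{1}{\sqrt{2\pi}}\int_{-b}^{\infty}(y + b)\,e^{-y^2/2}\,dy = \frac{1}{\sqrt{2\pi}}\,e^{-b^2/2} + b\,\Phi(b),$$

and hence

$$E|X| = \frac{2}{\sqrt{2\pi}}\,e^{-b^2/2} + b\,\Phi(b) + (-b)\,\Phi(-b) = \sqrt{2/\pi}\,e^{-b^2/2} + b\,[2\Phi(b) - 1].$$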
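
Lemma 1 can also be checked numerically. The sketch below compares the
closed form for $E|X|$ with a midpoint-rule integration of $|x|$ against the
normal density over the range $b \pm 10\sigma$. The integration range, the step
count, and the function names are choices made for this illustration.

```python
import math

def std_normal_cdf(x):
    """Cumulative distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def mean_abs_closed_form(b, sigma):
    """Closed form of E|X| for X ~ N(b, sigma^2), as stated in lemma 1."""
    return (math.sqrt(2.0 / math.pi) * sigma * math.exp(-b**2 / (2.0 * sigma**2))
            + b * (2.0 * std_normal_cdf(b / sigma) - 1.0))

def mean_abs_numeric(b, sigma, steps=20_000):
    """Midpoint-rule approximation of E|X| over the interval b +/- 10*sigma."""
    lo = b - 10.0 * sigma
    width = 20.0 * sigma / steps
    norm = sigma * math.sqrt(2.0 * math.pi)
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * width
        density = math.exp(-((x - b) ** 2) / (2.0 * sigma**2)) / norm
        total += abs(x) * density * width
    return total

closed = mean_abs_closed_form(1.5, 2.0)
numeric = mean_abs_numeric(1.5, 2.0)
```

The two values should agree closely; the small remaining difference comes
from the discretization and the truncated tails.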



Proof of Theorem 1 Parts 2 and 3. First note that function
$\varphi(x) = 1/x$ is convex over $(0, \infty)$. Let random variable X
have a uniform distribution on the set $\{F_i : 1 \le i \le n\}$, that
is, $P(X = F_i) = 1/n$. An application of the Jensen inequality,
$E\varphi(X) \ge \varphi(EX)$, leads to the desired inequality. The
second part is based on the condition for the Jensen inequality to
become an equality.
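The Jensen step can be illustrated with a few numbers. This is a small sketch under the assumption that the forecasts are positive; the function name `jensen_sides` and the sample forecast values are made up for illustration.

```python
def jensen_sides(forecasts):
    """Return (E[1/X], 1/E[X]) for X uniform on the given positive forecasts."""
    if not forecasts or any(f <= 0 for f in forecasts):
        raise ValueError("forecasts must be a non-empty list of positive numbers")
    n = len(forecasts)
    mean_of_reciprocals = sum(1.0 / f for f in forecasts) / n
    reciprocal_of_mean = 1.0 / (sum(forecasts) / n)
    return mean_of_reciprocals, reciprocal_of_mean


# Unequal forecasts: the left side is strictly larger.
lhs, rhs = jensen_sides([100.0, 250.0, 40.0])
print(lhs >= rhs)
```

When all forecasts are equal, the two sides coincide, which is the equality condition used in the second part of the proof.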

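Proof of Theorem 4:

$$e(D_i, F_i) = \frac{\sqrt{E[D_i - F_i]^2}}{F_i} = \frac{\sqrt{\sigma_i^2 + b_i^2}}{F_i},$$

$$E_g = \frac{1}{n} \sum_{i=1}^{n} e(D_i, F_i) = \frac{1}{n} \sum_{i=1}^{n} \frac{\sqrt{\sigma_i^2 + b_i^2}}{F_i}.$$
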



Hence we have; 



En _ - /^Tbip 

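Proof of Theorem 5: Noting that:

$$D_i - F_i \sim N(b, \sigma^2) \quad \text{and} \quad \sum_{i=1}^{n} (D_i - F_i) \sim N(nb, n\sigma^2),$$

we have

$$\frac{E_g}{E_a} = \frac{E(e_g)}{E(e_a)}
= \frac{\dfrac{1}{n}\sum_{i=1}^{n}\dfrac{1}{F_i}\; H(b, \sigma)}{\dfrac{1}{\sum_{i=1}^{n} F_i}\; H(nb, \sqrt{n}\,\sigma)} \quad \text{(by lemma 1)}$$

$$= \sqrt{n}\, c_n\, \frac{\sqrt{2/\pi}\,\sigma e^{-b^2/(2\sigma^2)} + b\,[2\Phi(b/\sigma) - 1]}{\sqrt{2/\pi}\,\sigma e^{-nb^2/(2\sigma^2)} + \sqrt{n}\,b\,[2\Phi(\sqrt{n}\,b/\sigma) - 1]}.$$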
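The ratio in this proof can be checked by simulation. The sketch below rests on assumptions not visible in this part of the text: it takes the item-level error as the average of |D_i - F_i|/F_i, the aggregate error as |Σ(D_i - F_i)|/ΣF_i, and c_n as the product of the mean of 1/F_i and the mean of F_i. All function names and sample values are illustrative.

```python
import math
import random


def phi_cdf(z: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def H(b: float, sigma: float) -> float:
    """E|X| for X ~ N(b, sigma^2), the closed form of lemma 1."""
    return (math.sqrt(2.0 / math.pi) * sigma * math.exp(-b * b / (2.0 * sigma * sigma))
            + b * (2.0 * phi_cdf(b / sigma) - 1.0))


def predicted_ratio(forecasts, b: float, sigma: float) -> float:
    """Closed-form ratio of item-level to aggregate error under the stated assumptions."""
    n = len(forecasts)
    c_n = (sum(1.0 / f for f in forecasts) / n) * (sum(forecasts) / n)
    return n * c_n * H(b, sigma) / H(n * b, math.sqrt(n) * sigma)


def simulated_ratio(forecasts, b: float, sigma: float,
                    trials: int = 50_000, seed: int = 1) -> float:
    """Monte Carlo estimate of the same ratio with independent N(b, sigma^2) errors."""
    rng = random.Random(seed)
    n = len(forecasts)
    total_forecast = sum(forecasts)
    item_level = aggregate = 0.0
    for _ in range(trials):
        errors = [rng.gauss(b, sigma) for _ in range(n)]  # D_i - F_i
        item_level += sum(abs(e) / f for e, f in zip(errors, forecasts)) / n
        aggregate += abs(sum(errors)) / total_forecast
    return item_level / aggregate
```

With b = 0 the predicted ratio collapses to √n times c_n, so unbiased forecasts give the square-root relationship between the two levels.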

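Proof of Theorem 7: The proofs for equations 11 and 12 are similar. We
give a proof for equation 11 only. First notice that
$D_t^{(i)} - F_t^{(i)} = e_t^{(i)} \sim N(0, \sigma^2)$. At any given
time t, by the definitions for $E_g^{(t)}$ and $E_a^{(t)}$, we have:

$$E_g^{(t)} = E\!\left(\frac{1}{n}\sum_{i=1}^{n} \frac{|e_t^{(i)}|}{F_t^{(i)}}\right)
= \frac{1}{n}\sum_{i=1}^{n} E|e_t^{(i)}|\; E\!\left(\frac{1}{F_t^{(i)}}\right)
= \sqrt{\frac{2}{\pi}}\,\sigma\, \frac{1}{n}\sum_{i=1}^{n} E\!\left(\frac{1}{F_t^{(i)}}\right).$$

The second step follows from the fact that $e_t^{(i)}$ is independent of
demands before time t, and hence independent of the optimal forecast at
time t, $F_t^{(i)}$. The last step follows from lemma 1 and the same
variance assumption across platforms.

Then we have:

$$E_a^{(t)} = E\!\left(\frac{\left|\sum_{i=1}^{n} e_t^{(i)}\right|}{\sum_{i=1}^{n} F_t^{(i)}}\right)
= E\left|\sum_{i=1}^{n} e_t^{(i)}\right| \; E\!\left(\frac{1}{\sum_{i=1}^{n} F_t^{(i)}}\right)
= \sqrt{n}\,\sigma \sqrt{\frac{2}{\pi}}\; E\!\left(\frac{1}{\sum_{i=1}^{n} F_t^{(i)}}\right).$$

The reasoning is the same as for proving $E_g^{(t)}$ above.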



Conclusion 



Forecast errors increase the complexity and difficulty of the
production planning process. This results in excessive inventory costs
and reduces on-time delivery. In this paper we have studied the
forecast errors for the case of several products using the same
component. Because data for the component demand (both actual demand
and forecast demand) is easier to obtain at the aggregate product level
than at the individual product level, we focused on the theoretical
relationships between forecast errors at these two levels.

Our first task was to propose formal definitions for measuring forecast
errors under different rationales and technical assumptions. The second
task was to formally derive relationships between forecast errors at
the two levels. As part of our work we proved the validity of a
heuristic formula proposed by Mark Sower of the business operations
planning department at the HP Roseville, California site.

In addition to analyzing the two-level problem, we derived a
theoretical basis for relaxing the usual assumptions concerning
correlations in the data across products and over time.
Strengthening Software Quality Assurance

Mutsuhiko Asada

Pong Mang Yan

Increasing time-to-market pressures in recent years have resulted in a
deterioration of the quality of software entering the system test phase. At
HP's Kobe Instrument Division, the software quality assurance process was
reengineered to ensure that released software is as defect-free as possible.

The Hewlett-Packard Kobe Instrument Division (KID) develops
measurement instruments. Our main products are LCR meters and network,
spectrum, and impedance analyzers. Most of our software is built into these
instruments as firmware. Our usual development language is C. Figure 1
shows our typical development process.

Given adequate development time, we are able to include sufficient software
quality assurance activities (such as unit test, system test, and so on) to provide
high-quality software to the marketplace. However, several years ago, time-to-
market pressure began to increase and is now very strong. There is no longer
enough development time for our conventional process. In this article, we
describe our perceived problems, analyze the causes, describe countermeasures
that we have adopted, and present the results of our changes.



Figure 1 

Hewlett-Packard Kobe Instrument Division software development process
before improvement

Existing Development Process

The software development process that we have had in
place since 1986 includes the following elements:

■ Improvement in the design phase. We use structured
design methods such as modular decomposition, we use
defined coding conventions, and we perform design
reviews for each software module.

■ Product series strategy. The concept of the product
series is shown in Figure 2. First, we develop a platform
product that consists of newly developed digital
hardware and software. We prudently design the platform
to facilitate efficient development of the next and
succeeding products. We then develop extension products
that reuse the digital hardware and software of the
platform product. Increasing the reuse rate of the
software in this way contributes to high software quality.

■ Monitoring the defect curve. The defect curve is a plot
of the cumulative number of defects versus testing time
(Figure 3). We monitor this curve from the beginning
of system test and make the decision to exit from
the system test phase when the curve shows sufficient
convergence.

As a result of the above activities, our products' defect
density (the number of defects within one year after
shipment per thousand noncomment source statements) had
been decreasing. In one product, fewer than five defects
were discovered in customer use.
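The two measures described above, the defect curve and the defect density, can be sketched in a few lines. This is an illustration only: the convergence rule (`has_converged`, with its `window` and `max_new` parameters) is a hypothetical criterion, since the article does not state how "sufficient convergence" was judged, and the sample counts are invented.

```python
def cumulative_defects(defects_per_period):
    """Turn per-period defect counts into the cumulative defect curve."""
    curve, total = [], 0
    for count in defects_per_period:
        total += count
        curve.append(total)
    return curve


def has_converged(curve, window=5, max_new=1):
    """Hypothetical rule: at most max_new new defects over the last window periods."""
    if len(curve) <= window:
        return False  # not enough history to judge
    return curve[-1] - curve[-1 - window] <= max_new


def defect_density(defects_first_year, noncomment_source_statements):
    """Defects found within one year of shipment per thousand NCSS."""
    return defects_first_year / (noncomment_source_statements / 1000.0)


curve = cumulative_defects([5, 8, 6, 3, 1, 0, 0, 1, 0, 0, 0])
print(curve, has_converged(curve))
```

A flat tail on the curve is what the exit decision looks for; the threshold values would be chosen per project.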

Perceived Problems 

Strong time-to-market pressure, mainly from consumers
and competitors, has made our development period and
the interval between projects shorter. As a result, we
have recognized two significant problems in our products
and process: a deterioration of software quality and an
increase in maintenance and enhancement costs.

Figure 2

The product series concept increases the software reuse
rate, thereby increasing software quality.

Figure 3

Typical defect curves.

Deterioration of software quality. In recent years (1995
to 1997), software quality has apparently been deteriorating
before the system test phase. In our analysis, this
phenomenon is caused by a decrease in the coverage of unit
and integration testing. In previous years, R&D engineers
independently executed unit and integration testing of the
functions that they implemented before the system test
phase. At present, those tests are not executed sufficiently
because of the shortness of the implementation phase
under high time-to-market pressure. Because of the
decrease in test coverage, many single-function defects
(defects within the range of a function, as opposed to
combination-function defects) remain in the software at
the start of system test (Figure 4). Also, our system test
periods are no longer as long. We nearly exhaust our
testing time to detect single-function defects in shallow
software areas, and we often don't reach the
combination-function defects deep within the software. This makes
it less likely that we will get convergence of the defect
curve in the limited system test phase (Figure 5).

Increase of maintenance and enhancement costs.
For our measurement instruments, we need to enhance
the functionality continuously to satisfy customers'
requirements even after shipment. In recent products,
the enhancement and maintenance cost is increasing
(Figure 6). This cost consists of work for the addition of
new functions, the testing of new and modified functions, and
so on. In our analysis, this phenomenon occurs for the
following reasons. First, we often begin to implement
functions when the detailed specifications are still vague
and the relationships of functions are still not clear.
Second, specifications can change to satisfy customer
needs even in the implementation phase. Thus, we may
have to implement functions that are only slightly different
from already existing functions, thereby increasing the
number of functions and pushing the cost up. Figure 7
shows that the number of functions increases from one
product to another even though the two products are
almost the same.

Figure 4

Change in the proportion of single-function defects found
in the system test phase.

Figure 5

Defect curves for post-1995 products.

Figure 6

Increase in the cost per function of enhancement and
maintenance. The first enhancements for Product B
occurred in 1991

Often the internal software structure is not suitable for a
particular enhancement. This can result from vague function
definition in the design phase, which can make the
software structure inconsistent and not strictly defined.
In the case of our combination network and spectrum
analyzers, we didn't always examine all the relationships
among analyzer modes and the measurement and analyzer
functions (e.g., different display formats for network and
spectrum measurement modes).

Naturally, the enhancement process intensely disturbs
software internal structures, which forces us to go through
the same processes repeatedly and detect and fix many
additional side-effect defects.

Figure 7

Increase in the number of commands in two similar
analyzers as a result of changing customer needs.

Countermeasures

If we had enough development time, our problems would
be solved. However, long development periods are no
longer possible in our competitive marketplace. Therefore,
we have improved the development process upstream to
handle these problems. We have set up two new checkpoints
in the development process schedule to make sure
that improvement is steady (Figure 8). In this section we
describe the improvements.

We plan to apply these improvement activities in actual
projects over a three-year span. The software quality
assurance department (SWQA) will appropriately revise
this plan and improve it based on experience with actual
projects.

Design Phase: Improvement of Function Definition. We
have improved function definition to ensure sufficient
investigation of functions and sufficient testing to remove
single-function defects early in the development phase.
We concisely describe each function's effects, range of
parameters, minimum argument resolution, related functions,
and so on in the function definition (Figure 9).
Using this function definition, we can prevent duplicate
or similar functions and design the relationships of the
measurement modes and functions precisely. In addition,
we can clearly define functions corresponding to the
product specifications and clearly check the subordinate
functions, so that we can design a simple and consistent
internal software structure. We can also easily write the
test scripts for the automatic tests, since all of the
necessary information is in the function definitions.

Figure 8

Improved software development process.

SWQA, not R&D, has ownership of the template for function
definition. SWQA manages and standardizes this
template to prevent quality deterioration and ensure that
improvements that have good effects are carried on to
future projects.

Checkpoint at the End of the Design Phase. The first
new checkpoint in the development process is at the end
of the design phase. SWQA confirms that all necessary
information is contained in the function definitions. SWQA
approves the function definitions before the project goes
on to the implementation phase.

Implementation Phase: Automatic Test Execution. In
this phase, SWQA mainly writes test scripts based on the
function definitions for automatic tests to detect
single-function defects. We use equivalence partitioning and
boundary value analysis to design test scripts. As for
combination-function defects, since the number of
combinations is almost infinite, we write test scripts based only
on the content of the function definitions. When we
implement the functions, we immediately execute the
automatic tests by using the scripts corresponding to these
functions. Thus, we confirm the quality of the software as
soon as possible. For functions already tested, we
re-execute the automatic tests periodically and check for
side effects caused by new function implementations. As
a result of these improvements, we obtain software with
no single-function defects before the system test phase,
thereby keeping the software quality high in spite of the
short development period. The test scripts are also used
in regression testing after shipment to confirm the quality
of modified software in the enhancement process. In this
way, we can reduce maintenance and enhancement costs.
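As an illustration of how equivalence partitioning and boundary value analysis can be driven from a function definition, the sketch below derives test values from a parameter's range and resolution. The `ParameterDefinition` record, its field names, and the sample parameter are hypothetical; the article does not show the actual template or script format.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ParameterDefinition:
    """Hypothetical slice of a function definition: one numeric parameter."""
    name: str
    minimum: float
    maximum: float
    resolution: float


def boundary_values(p: ParameterDefinition):
    """Boundary value analysis: each limit plus its nearest valid neighbor."""
    return [p.minimum, p.minimum + p.resolution,
            p.maximum - p.resolution, p.maximum]


def equivalence_classes(p: ParameterDefinition):
    """Equivalence partitioning: one representative value per class."""
    steps_to_middle = math.floor((p.maximum - p.minimum) / (2 * p.resolution))
    return {
        "below_range": p.minimum - p.resolution,   # expected to be rejected
        "in_range": p.minimum + steps_to_middle * p.resolution,
        "above_range": p.maximum + p.resolution,   # expected to be rejected
    }


points = ParameterDefinition("number_of_points", 1.0, 100.0, 1.0)
print(boundary_values(points))
print(equivalence_classes(points))
```

Because the range and resolution come straight from the function definition, a script generator of this kind needs no other source of information, which matches the role the article gives to the definitions.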

Figure 9

An example of the improvement in function definition. (a) Before improvement. (b) After improvement.

Checkpoint at the End of the Implementation Phase. At
the second new checkpoint in the development process,
SWQA confirms that the test scripts reflect all the content
of the function definitions, and that there are no
significant problems in the test results. The project cannot go
on to the system test phase without this confirmation.

System Test Phase: Redefinition of System Testing.
In an ideal testing process, we can finish system testing
when we have executed all of the test items in the test
cases we have written. However, if many single-function
defects are left in the software at the start of system test,
we will detect single-function and combination-function
defects simultaneously, and the end of testing will become
unclear. Therefore we use statistical methods, such as
convergence of the defect curve, to decide when to end
the system test phase.

In our improved process, we can start the system test
phase with high-quality code that includes only a few
single-function defects. Thus, we can redefine the testing
method to get more efficiency in detecting the remaining
defects. We divide the system test items into two test
groups. The first group uses black box testing. We write
these test cases based on the instrument characteristics
as a system and on common failures that have already
been detected in the preceding series products. The
second group is measurement application testing, which
is known as white box testing. The R&D designers, who
clearly know the measurement sequence, test the
measurement applications according to each instrument's
specifications. We try to decide the end of system test
based on the completion of test items in the test cases
written by R&D and SWQA. We try not to depend on
statistical methods.

Checkpoint at the End of the System Test Phase. We use
this checkpoint as in the previous process, as an audit
point to exit the system test phase. SWQA confirms the
execution of all test items and results.

A Feasibility Study of Automatic Test

Before implementing the improved development process
described above, we wanted to understand what kind of
function is most likely to cause defects and which parts
we can't test automatically. Therefore, we analyzed and
summarized the defect reports from a previous product
series (five products). We found that the front-panel keys,
the HP-IB remote control functions, and the Instrument
BASIC language are most likely to cause defects. We also
observed that the front-panel keys and the display are
difficult to test automatically. Based on this study, we
knew which parts of the functions needed to be written
clearly on the function definitions, and we edited the test
items and checklist to make the system test more efficient.

Application of the Improvement Process

Project Y. Product Y is an extension and revision of
Product X, a combination network, spectrum, and impedance
analyzer. The main purpose of Project Y was to change
the CRT display to a TFT (thin-film transistor) display and
the HP-IB printer driver to a parallel printer driver. Most
of the functions of the analyzer were not changed.

Since Product Y is a revision product, we didn't have to
write new function definitions for the HP-IB commands.
Instead, we used the function reference manual, which
has the closest information to a function definition. The
main purpose of the test script was to confirm that each
command worked without fail. We also tested some
combination cases (e.g., testing each command with different
channels). The test script required seven weeks to write.
The total number of lines is 20441.

For the automatic tests, we analyzed the defect reports
from five similar products and selected the ones related
to the functions that are also available in Product Y (391
defect reports in the system phase). Then we identified
the ones that could be tested automatically. The result
was 140 reports, which is about 40% of the total. The
whole process took three weeks to finish and the test
script contains 1972 lines. The rest of the defect reports
were checked manually after the end of system test.
It took about seven hours to finish this check.

Both of the above test scripts were written for an in-house
testing tool developed by the HP Santa Clara Division.
An external controller (workstation) transfers the
command to the instrument in ASCII form, receives the
response, and decides if the test result passes or fails.
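The controller loop described above (send an ASCII command, read the response, judge pass or fail) can be sketched as follows. This is not the in-house tool: `FakeInstrument`, `ScriptStep`, `run_script`, and the command strings are hypothetical stand-ins, and a real setup would talk to the instrument over its remote interface rather than to an in-memory object.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ScriptStep:
    """One line of a test script: a command and the response it should produce."""
    command: str
    expected: str


class FakeInstrument:
    """In-memory stand-in for an instrument; the command set is made up."""

    def __init__(self) -> None:
        self._settings = {}

    def send(self, command: str) -> str:
        if command.endswith("?"):                 # query form: "NAME?"
            return self._settings.get(command[:-1], "ERROR")
        name, _, value = command.partition(" ")   # setting form: "NAME value"
        self._settings[name] = value
        return "OK"


def run_script(instrument, steps: List[ScriptStep]) -> List[Tuple[str, bool]]:
    """Send each command as ASCII text and compare the response to the expectation."""
    results = []
    for step in steps:
        wire_text = step.command.encode("ascii").decode("ascii")  # rejects non-ASCII input
        response = instrument.send(wire_text)
        results.append((step.command, response == step.expected))
    return results


script = [ScriptStep("FREQ 1000", "OK"), ScriptStep("FREQ?", "1000")]
print(run_script(FakeInstrument(), script))
```

Keeping the expected response next to each command lets the same script serve later as a regression test.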

Instrument BASIC (IBASIC), the internal instrument control
language, has many different functions. It comes with
a suite of 295 test programs, which we executed
automatically using a workstation. The workstation downloaded
each test program to the instrument, ran the program, and
saved the result. When all the programs finished running,
we checked if the result was pass or fail.

For all of the automatic testing, we used the UNIX make
command to manage the test scripts. The make command
let each test program execute sequentially.

Using the test scripts, we needed only half a day to test all
of the HP-IB commands and one day to test the IBASIC.
Since Product Y is a revision product, we also used the
test scripts to test its predecessor, Product X, to confirm
that Product Y is compatible with Product X. The test
items in the Product X checklist were easily modified to
test Product Y.

Project Z. Product Z belongs to the same product series
as Product Y (a combination network, spectrum, and
impedance analyzer). The reuse rate of source code is
77% of Product Y.

One R&D engineer took one month to finish the first draft
of the function definitions. To test the individual HP-IB
commands, since the necessary function definition
information existed, we easily modified the test script for
Product Y to test Product Z. We employed a third-party
engineer to write the test scripts. This took five weeks.

Since Product Z is in the same series as Product Y, we are
reusing the test scripts for Product Y and adding the new
test scripts corresponding to the new defects that were
detected in Product Y to test Product Z.

The IBASIC is the same as Product Y's, so we use the same
test program for Product Z. The automatic test
environment is also the same as for Product Y.

Since Product Z is still under development, we don't have
the final results yet. We use the test scripts to confirm the
individual HP-IB commands periodically. This ensures that
the quality of the instrument's software doesn't degrade
as new functions are added. At this writing, we haven't
started system test, but we plan to reuse the same product
series checklist to test Product Z.

Results 

Project Y. In this project, we found 22 mistakes in the
manual, 66 defects in Product X while preparing the test
scripts, and 53 defects in Product Y during system test.
The following table lists the total time spent on testing
and the numbers of defects that were detected in Product
X in Project X and Project Y.

Table 1

Defects found in Product X

                          Project X    Project Y
Testing Time (hours)         1049         200
Number of Defects             309          88



According to this data, using the test scripts based on the
function reference manual, we detected 88 defects in
Product X during Project Y, even though we had already
invested more than 1000 test hours in Project X and the
defect curve had already converged (Figure 3). We
conclude that testing the software with a test script increases
the ability to detect defects. Also we see that a function
definition is indispensable for writing a good test script.

Since the automatic test is executed periodically during
the implementation phase, we can assume that no
single-function defects remained in Product Y's firmware before
system test. Since Product Y is a revision product, there
were only a few software modifications, and we could
assume that the test items for the system testing covered
all the modified cases. Therefore, we could make a
decision to stop the system test when all the test items were
completed, even though the defect curve had not
converged (Figure 10). However, for a platform product or
an extension product that has many software
modifications and much new code, the test items of the system
test are probably not complete enough to make this
decision, and we will still have to use the convergence of the
defect curve to decide the end of the system test.
Nevertheless, it will always be our goal to make the test items
of the system test complete enough that we can make
decisions in the future as we did in Project Y.

The test script is being used for regression testing during
enhancement of Product Y to prevent the side effects
caused by software modifications.

In Figure 11, we compare the test time and the average
defect detection time for these two projects. Because
Product Y is an extension of Product X, the results are
not exactly comparable, but using the test script appears
to be better because it didn't take as much time to detect
the average defect.
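The comparison can be made concrete with the Table 1 figures. The sketch below computes testing hours per defect found in Product X for each project; note that Figure 11 reports engineer-months, so this is a related calculation from the table rather than a reproduction of the figure. The function name is illustrative.

```python
def hours_per_defect(testing_hours: float, defects_found: int) -> float:
    """Average testing time spent per detected defect."""
    if defects_found <= 0:
        raise ValueError("need at least one defect to compute an average")
    return testing_hours / defects_found


# (testing hours, defects found in Product X), taken from Table 1
table_1 = {"Project X": (1049, 309), "Project Y": (200, 88)}

for project, (hours, defects) in table_1.items():
    print(f"{project}: {hours_per_defect(hours, defects):.2f} hours per defect")
```

On these numbers the script-based testing in Project Y needed roughly two thirds of the time per defect that Project X did.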

Figure 10

Defect curve for Project Y.

We needed time to write the test scripts, but the system
test phase became shorter, so the total development time
was shorter for Project Y. The enhancement cost will be
lower because we can reuse the same test script for
regression testing.

Project Z. We expect that the quality of Product Z will be
high before system test because we test Product Z
periodically in the implementation phase and confirm the result
before entering system test.

The additional work of the improvement process is to
write formal function definitions and test scripts. Since
this project is the first to require a formal function
definition, it took the R&D engineer one month to finish the
first draft. For the next project, we expect that the
function definition can be mostly reused, so the time needed
to write it will be shorter.

The test scripts are written during the implementation
phase and do not affect the progress of the project.
Therefore, we only need to wait about a month for writing the
function definition before starting the implementation
phase, and since the time needed for system test will be
shorter, the whole development process will be faster.

Since we are reusing the test scripts of Product Y, the
time for writing test scripts for Product Z is two weeks
shorter than for Product Y. Thus, for a series product, we
can reuse the test scripts to make the process faster. Also,
making test scripts is not a complicated job, so a
third-party engineer can do it properly.

Figure 11

Cost of software testing for Projects X and Y. (a) Engineer-months spent on software testing. (b) Engineer-months per defect.

Conclusion

We analyzed the software (firmware) development
problems of the Hewlett-Packard Kobe Instrument Division
and decided on an improvement process to solve these
problems. This improvement process has been applied to
two projects: Project Y and Project Z. The results show
that we can expect the new process to keep the software
quality high with a short development period. The main
problems (deteriorating software quality and increasing
enhancement cost) have been reduced.

This improvement process will be standardized and
applied to other new projects. It will also make our software
development process conform to the key process areas of
CMM (Capability Maturity Model) level 2 and some part of
level 3.

A Compiler for HP VEE



Steven Greenbaum 



Stanley Jefferson 



With the addition of a compiler, HP VEE programs can now benefit from 
improved execution speed and still provide the advantages of an interactive 
interpreter. 

This article presents the major algorithmic aspects of a compiler for the
Hewlett-Packard Visual Engineering Environment (HP VEE). HP VEE is a
powerful visual programming language that simplifies the development of
engineering test-and-measurement software. In the HP VEE development
environment, engineers design programs by linking visual objects (also called
devices) into block diagrams. A simple example is shown in Figure 1.
Features provided in HP VEE include:

■ Support for engineering math and graphics

■ Instrument control

■ Concurrency

■ Data management

■ GUI support

■ Test sequencing

■ Interactive development and debugging environment.

Beginning with release 4.0, HP VEE uses a compiler to improve the execution
speed of programs. The compiler translates an HP VEE program into
byte-code that is executed by an efficient interpreter embedded in HP VEE. By
analyzing the control structures and data type use of an HP VEE program, the
compiler determines the evaluation order of devices, eliminates unnecessary
run-time decisions, and uses appropriate data structures.

The HP VEE 4.0 compiler increases the performance of computation-intensive
programs by about 40 times over previous versions of HP VEE. In applications
where execution speed is constrained by instruments, file
input and output, or display update, performance typically
increases by 150 to 400 percent.

Figure 1

A simple HP VEE program to compute the area of a circle.

The compiler described in this article is a prototype
developed by HP Laboratories to compile HP VEE 3.2
programs. The compiler in HP VEE 4.0 differs in some
details. The HP VEE prototype compiler consists of five
components:

■ Graph Transformation. Transformations are performed
on a graph representation of the HP VEE program. The
transformations facilitate future compilation phases.

■ Device Scheduling. An execution ordering of devices
is obtained. The ordering may have hierarchical
elements, such as iterators, that are recursively ordered.
The ordering preserves the data flow and control flow
relationships among devices in the HP VEE program.
Scheduling does not, however, represent the run-time
flow branching behavior of special devices such as
If/Then/Else.

■ Guard Assignment. The structure produced by
scheduling is extended with constructs that represent run-time
flow branching. Each device is annotated with boolean
guards that represent conditions that must be satisfied
at run time for the device to run. Adjacent devices with
similar guards are grouped together to decrease
redundancy of run-time guard processing. Guards can result
from explicit HP VEE branching constructs such as
If/Then/Else, or they can result from implicit properties
of other devices, such as guards that indicate whether
an iterator has run at least once.

■ Type Annotation. Devices are annotated with type
information that gives a conservative analysis of what types
of data are input to, and output from, a device. The
annotations can be used to generate type-specific code.

■ Code Generation. The data structures maintained by the
compiler are traversed to generate target code. The
prototype compiler can generate C code and byte-code.
However, code generation is relatively straightforward to
implement for most target languages.

These components combine to implement the semantics
explicitly and implicitly specified in an HP VEE program.
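To give a feel for the device scheduling step, the sketch below orders devices so that each one runs after the devices feeding it. It is a plain topological sort (Kahn's algorithm) over a flat graph and is only an assumption about the general technique: the prototype's actual algorithm, its handling of hierarchical elements such as iterators, and its control flow rules are not described in this excerpt. The device names are invented to resemble the circle-area example.

```python
from collections import deque


def schedule_devices(devices, edges):
    """Return an execution order in which every device follows its producers.

    devices: list of device names.
    edges:   (producer, consumer) pairs for data flow or control flow links.
    Raises ValueError if the links form a cycle, which this sketch does not handle.
    """
    consumers = {device: [] for device in devices}
    pending_inputs = {device: 0 for device in devices}
    for producer, consumer in edges:
        consumers[producer].append(consumer)
        pending_inputs[consumer] += 1

    ready = deque(device for device in devices if pending_inputs[device] == 0)
    order = []
    while ready:
        device = ready.popleft()
        order.append(device)
        for consumer in consumers[device]:
            pending_inputs[consumer] -= 1
            if pending_inputs[consumer] == 0:   # all inputs now scheduled
                ready.append(consumer)

    if len(order) != len(devices):
        raise ValueError("cycle detected: no valid device ordering")
    return order


print(schedule_devices(
    ["Display", "Formula", "Radius"],
    [("Radius", "Formula"), ("Formula", "Display")],
))
```

Fixing the order at compile time is what lets the run-time system skip the decision about which device to run next.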

This complete article can be found at:

http://www.hp.com/hpj/98may/ma98a13.htm

More information about HP VEE can be found at:
http://www.hp.com/go/HPVEE